Solubilized enzyme and uses thereof

ABSTRACT

The present invention relates to mixtures comprising a polypeptide or a plurality of polypeptides having biomass-degrading activity that are solubilized from an inclusion body and retain biomass-degrading activity, and to methods for producing and using the same. The invention described herein provides methods for increasing the yield of recombinant protein with biomass-degrading activity that can be isolated from host cells.

RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 14/782,205, filed Oct. 2, 2015, which is a national stage application under 35 U.S.C. § 371 of International Application No. PCT/US2015/052200, filed Sep. 25, 2015, which claims the benefit of U.S. Provisional Application No. 62/055,702, filed Sep. 26, 2014; the entire contents of each of which are hereby incorporated by reference.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Sep. 1, 2015, is named X2002-7003WO_SL.txt and is 76,946 bytes in size.

FIELD OF THE INVENTION

The present invention relates generally to mixtures comprising a polypeptide having biomass-degrading activity solubilized from inclusion bodies and having biomass-degrading activity, and methods for producing the mixtures described herein. The present invention also provides methods for using such mixtures, e.g., to process biomass materials.

BACKGROUND OF THE INVENTION

Biomass-degrading enzymes, such as cellulases, xylanases, and ligninases, are important for the degradation of biomass, such as feedstock. Cellulosic and lignocellulosic materials are produced, processed, and used in large quantities in a number of applications. Often such materials are used once, and then discarded as waste, or are simply considered to be waste materials, e.g., sewage, bagasse, sawdust, and stover.

SUMMARY OF THE INVENTION

High level of expression of recombinant proteins in host cells such as E. coli can lead to accumulation of the recombinant proteins into insoluble aggregates within the host cell. These insoluble aggregates are called inclusion bodies and can also contain other components, such as proteins endogenous to the host cell, ribosomal components, nucleic acids, and cellular debris. Solubilization of the recombinant proteins from the inclusion bodies can be achieved through treatment with high concentrations of a solubilizing agent such as urea, which disrupts hydrogen bonds and hydrophobic interactions. However, treatment with a solubilizing agent, such as urea, can result in denaturation of the protein and loss of enzymatic activity. Thus, the aggregation of recombinant proteins into inclusion bodies can reduce the yield of recombinant protein with enzymatic activity that can be isolated from the host cells.

The present invention is based, at least in part, on the surprising discovery that a heterologously expressed cellobiase that has been solubilized from inclusion bodies by a solubilizing agent, such as urea, retains cellobiase activity. Therefore, the methods described herein for solubilization of heterologously expressed cellobiase, or other biomass-degrading enzymes, are useful for increasing the yield of the heterologously expressed enzymes having biomass-degrading activity, e.g., by 30-40%. Furthermore, the presence of the solubilizing agent, e.g., urea, from the addition of the solubilized biomass-degrading enzyme, e.g., cellobiase, does not adversely affect the saccharification reaction for converting biomass to a sugar product and/or the yield of products.

Accordingly, in one aspect, the disclosure features a mixture comprising a polypeptide or a plurality of polypeptides having a biomass-degrading activity and a solubilizing agent, e.g., urea, wherein the polypeptide or plurality thereof has at least 8-10% biomass-degrading activity compared to the native polypeptide.

In one embodiment, the mixture further comprises one or more proteins associated with an inclusion body. Alternatively, in one embodiment, the mixture does not comprise one or more proteins associated with an inclusion body. In one embodiment, the mixture further comprises cellular debris, one or more ribosomal component, one or more host protein, e.g., protein endogenously expressed by the host cell, and/or host nucleic acid, e.g., DNA and/or RNA.

In one embodiment, the biomass-degrading activity is cellobiase activity, ligninase activity, endoglucanase activity, cellobiohydrolase activity, or xylanase activity.

In one embodiment, the polypeptide is partially unfolded, partially misfolded, or partially denatured.

In another aspect, the disclosure features a mixture comprising a polypeptide or a plurality of polypeptides having an amino acid sequence with at least 90% identity to SEQ ID NO: 1 and a solubilizing agent, e.g., urea, wherein the polypeptide or plurality thereof has at least 20% of the activity of the native polypeptide, e.g., SEQ ID NO: 1 or Cel3a from T. reesei. For example, the mixture further comprises one or more proteins associated with an inclusion body. Alternatively, the mixture does not comprise one or more proteins associated with an inclusion body. The mixture may further comprise one or more of the following: cellular debris, one or more ribosomal component, one or more host protein, e.g., protein endogenously expressed by the host cell, and/or host nucleic acid, e.g., DNA and/or RNA. The polypeptide with at least 90% identity to SEQ ID NO: 1 may be partially unfolded, partially misfolded, or partially denatured.

In one embodiment, the polypeptide comprises an amino acid sequence with at least 90% identity to SEQ ID NO: 1. In one embodiment, the polypeptide comprises a Cel3A enzyme from T. reesei, or a functional variant or fragment thereof. In one embodiment, the Cel3A enzyme comprises (e.g., consists of) the amino acid sequence SEQ ID NO: 1. In one embodiment, the polypeptide is encoded by a nucleic acid sequence comprising (e.g., consisting of) at least 90% identity to SEQ ID NO: 2 or SEQ ID NO: 3.
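By way of illustration only, a percent-identity threshold such as the 90% recited above can be checked for two sequences that have already been aligned. The following Python sketch is not part of the disclosed methods. It assumes one common convention (identical columns divided by aligned columns, with a gap opposite a residue counted as a mismatch); other conventions give different values. The function names and the short example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed convention: identical columns / all columns that are not
    gap-versus-gap. A gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-residue peptides differing at one position.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(meets_identity_threshold("ACDEFGHIKL", "ACDEFGHIKV"))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch covers only the final counting step.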

In one embodiment, the polypeptide is aglycosylated.

In one embodiment, the solubilizing agent, e.g., urea, is present in the mixture at a concentration between 0.2 M and 6 M.
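As a worked illustration of the concentration range above, the mass of urea needed for a given molarity and final volume follows from its molar mass (CH4N2O, about 60.056 g/mol). The Python sketch below is an assumption-laden convenience calculation, not a protocol from this disclosure; the function names are hypothetical.

```python
UREA_MOLAR_MASS_G_PER_MOL = 60.056  # CH4N2O, from standard atomic weights


def urea_mass_g(molarity_mol_per_l: float, final_volume_l: float) -> float:
    """Grams of urea to dissolve and bring to final_volume_l for the molarity."""
    if molarity_mol_per_l < 0 or final_volume_l < 0:
        raise ValueError("molarity and volume must be non-negative")
    return molarity_mol_per_l * final_volume_l * UREA_MOLAR_MASS_G_PER_MOL


def in_recited_range(molarity_mol_per_l: float) -> bool:
    """True if the molarity lies within the 0.2 M to 6 M range stated above."""
    return 0.2 <= molarity_mol_per_l <= 6.0


if __name__ == "__main__":
    # Mass of urea for 1 L of a 6 M solution (upper end of the range).
    print(round(urea_mass_g(6.0, 1.0), 3))
```

Because urea contributes substantially to solution volume at high concentrations, the computed mass is dissolved and then brought up to the final volume rather than added to a full volume of buffer.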

In one embodiment, the mixture further comprises at least one additional polypeptide having a biomass-degrading activity or a microorganism that produces one or more enzymes having a biomass-degrading activity. In one embodiment, the additional polypeptide is selected from a ligninase, an endoglucanase, a cellobiohydrolase, a cellobiase, and a xylanase, or any combination thereof. In one embodiment, the additional polypeptide is selected from:

-   a. a polypeptide comprising (e.g., consisting of) an amino acid sequence with at least 90% identity to SEQ ID NO: 1;
-   b. a Cel3A enzyme from T. reesei, or a functional variant or fragment thereof; or
-   c. a polypeptide encoded by a nucleic acid sequence comprising (e.g., consisting of) SEQ ID NO: 2 or SEQ ID NO: 3.

In one embodiment, the additional polypeptide is aglycosylated.

In one embodiment, the additional polypeptide is glycosylated.

In one aspect, the disclosure features a method for producing a mixture described herein comprising a polypeptide having biomass-degrading activity, one or more proteins associated with an inclusion body, and a solubilizing agent, e.g., urea, wherein the method comprises contacting a cell expressing the polypeptide having biomass-degrading activity, or lysate thereof, with a solubilizing agent, e.g., urea, at a concentration suitable for solubilizing the polypeptide. In one embodiment, the method further comprises lysing the cell to obtain a lysate, separating a soluble fraction from an insoluble fraction of the lysate, and resuspending the insoluble fraction in the solubilizing agent, e.g., urea. In one embodiment, the concentration of the solubilizing agent, e.g., urea, is between 0.2 M and 6 M, e.g., 6 M.

In one embodiment, the biomass-degrading activity is a cellobiase activity, a ligninase activity, an endoglucanase activity, a cellobiohydrolase activity, or a xylanase activity.

In one embodiment, the polypeptide comprises an amino acid sequence with at least 90% identity to SEQ ID NO: 1. In one embodiment, the polypeptide comprises a Cel3A from T. reesei, or a functional variant or fragment thereof.

In one embodiment, the polypeptide is aglycosylated.

In one aspect, the disclosure features a method for producing a polypeptide having a biomass-degrading activity comprising expressing the polypeptide in a cell and contacting the cell or a lysate thereof with a solubilizing agent, e.g., urea, at a concentration suitable for solubilizing the polypeptide.

In another aspect, the disclosure features a method for producing a polypeptide having biomass-degrading activity comprising providing a cell that has been genetically modified to produce at least one polypeptide having biomass-degrading activity, wherein at least a portion of said polypeptide having biomass-degrading activity is found in inclusion bodies, and contacting the cell, or a lysate thereof containing the inclusion bodies, with a solubilizing agent, e.g., urea, at a concentration suitable for solubilizing the polypeptide.

In one embodiment, the methods disclosed herein further comprise lysing the cell to obtain a lysate, separating a soluble fraction from an insoluble fraction of the lysate, and resuspending the insoluble fraction in the solubilizing agent, e.g., urea. In one embodiment, the concentration of the solubilizing agent, e.g., urea, is between 0.2 M and 6 M, e.g., 6 M.

In one embodiment, the biomass-degrading activity is a cellobiase activity, a ligninase activity, an endoglucanase activity, a cellobiohydrolase activity, or a xylanase activity.

In one embodiment, the aglycosylated polypeptide comprises (e.g., consists of) an amino acid sequence with at least 90% identity to SEQ ID NO: 1. In one embodiment, the aglycosylated polypeptide comprises a Cel3A from T. reesei, or a functional variant or fragment thereof.

In one embodiment, the cell is a prokaryotic or bacterial cell, e.g., an E. coli cell, e.g., an Origami E. coli cell.

In one embodiment, the polypeptide is aglycosylated.

In one aspect, the disclosure features a method of producing a product (e.g., hydrogen, a sugar, an alcohol) from a biomass (or converting a biomass to a product) comprising contacting a biomass with the mixture described herein comprising a polypeptide having biomass-degrading activity, one or more proteins associated with an inclusion body, and a solubilizing agent, e.g., urea, and, optionally, with a microorganism that produces one or more biomass-degrading enzyme and/or an enzyme mixture comprising biomass-degrading enzymes, under conditions suitable for the production of the product.

In one embodiment, the method further comprises a step of treating the biomass with an electron beam prior to contacting the biomass with the mixture described herein comprising a polypeptide having biomass-degrading activity, one or more proteins associated with an inclusion body, and a solubilizing agent, e.g., urea.

In one embodiment, the product is a sugar product. In one embodiment, the sugar product is glucose and/or xylose.

In one embodiment, the method further comprises a step of isolating the product. In one embodiment, the step of isolating the product comprises precipitation, crystallization, chromatography, centrifugation, and/or extraction.

In one embodiment, the enzyme mixture comprises at least two of the enzymes selected from B2AF03, CIP1, CIP2, Cel1a, Cel3a, Cel5a, Cel6a, Cel7a, Cel7b, Cel12a, Cel45a, Cel74a, paMan5a, paMan26a, and Swollenin.

In one embodiment, the biomass comprises starchy materials, sugar cane, agricultural waste, paper, paper product, paper waste, paper pulp, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, card stock, cardboard, paperboard, cotton, wood, particle board, forestry wastes, sawdust, aspen wood, wood chips, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, agricultural waste, silage, canola straw, wheat straw, barley straw, oat straw, rice straw, jute, hemp, flax, bamboo, sisal, abaca, corn cobs, corn stover, soybean stover, corn fiber, alfalfa, hay, coconut hair, sugar processing residues, bagasse, beet pulp, agave bagasse, algae, seaweed, manure, sewage, offal, agricultural or industrial waste, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, or any combination thereof.

In one embodiment, the biomass comprises a starchy material or a starchy material that includes a cellulosic component. In some embodiments, the biomass comprises one or more of an agricultural product or waste, a paper product or waste, a forestry product, or a general waste, or any combination thereof; wherein: a) an agricultural product or waste comprises sugar cane, jute, hemp, flax, bamboo, sisal, alfalfa, hay, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, canola straw, wheat straw, barley straw, oat straw, rice straw, corn cobs, corn stover, corn fiber, coconut hair, beet pulp, bagasse, soybean stover, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, or beeswing, or a combination thereof; b) a paper product or waste comprises paper, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, cardstock, cardboard, paperboard, or paper pulp, or a combination thereof; c) a forestry product comprises aspen wood, particle board, wood chips, or sawdust, or a combination thereof; and d) a general waste comprises manure, sewage, or offal, or a combination thereof.

In one embodiment, the method further comprises a step of treating the biomass prior to introducing the microorganism or the enzyme mixture to reduce the recalcitrance of the biomass, e.g., by treating the biomass with bombardment with electrons, sonication, oxidation, pyrolysis, steam explosion, chemical treatment, mechanical treatment, and/or freeze grinding.

In one embodiment, the microorganism that produces a biomass-degrading enzyme is from species in the genera selected from Bacillus, Coprinus, Myceliophthora, Cephalosporium, Scytalidium, Penicillium, Aspergillus, Pseudomonas, Humicola, Fusarium, Thielavia, Acremonium, Chrysosporium or Trichoderma. In one embodiment, the microorganism that produces a biomass-degrading enzyme is selected from Aspergillus, Humicola insolens (Scytalidium thermophilum), Coprinus cinereus, Fusarium oxysporum, Myceliophthora thermophila, Meripilus giganteus, Thielavia terrestris, Acremonium persicinum, Acremonium acremonium, Acremonium brachypenium, Acremonium dichromosporum, Acremonium obclavatum, Acremonium pinkertoniae, Acremonium roseogriseum, Acremonium incoloratum, Acremonium furatum, Chrysosporium lucknowense, Trichoderma viride, Trichoderma reesei, or Trichoderma koningii.

In one embodiment, the microorganism has been induced to produce biomass-degrading enzymes by combining the microorganism with an induction biomass sample under conditions suitable for increasing production of biomass-degrading enzymes compared to an uninduced microorganism. In one embodiment, the induction biomass sample comprises starchy materials, sugar cane, paper, paper products, paper waste, paper pulp, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, card stock, cardboard, paperboard, cotton, wood, particle board, forestry wastes, sawdust, aspen wood, wood chips, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, agricultural waste, silage, canola straw, wheat straw, barley straw, oat straw, rice straw, jute, hemp, flax, bamboo, sisal, abaca, corn cobs, corn stover, soybean stover, corn fiber, alfalfa, hay, coconut hair, sugar processing residues, bagasse, beet pulp, agave bagasse, algae, seaweed, manure, sewage, offal, agricultural or industrial waste, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, or any combination thereof.

In one embodiment, the induction biomass comprises a starchy material or a starchy material that includes a cellulosic component. In some embodiments, the induction biomass comprises one or more of an agricultural product or waste, a paper product or waste, a forestry product, or a general waste, or any combination thereof; wherein: a) an agricultural product or waste comprises sugar cane, jute, hemp, flax, bamboo, sisal, alfalfa, hay, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, canola straw, wheat straw, barley straw, oat straw, rice straw, corn cobs, corn stover, corn fiber, coconut hair, beet pulp, bagasse, soybean stover, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, or beeswing, or a combination thereof; b) a paper product or waste comprises paper, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, cardstock, cardboard, paperboard, or paper pulp, or a combination thereof; c) a forestry product comprises aspen wood, particle board, wood chips, or sawdust, or a combination thereof; and d) a general waste comprises manure, sewage, or offal, or a combination thereof.

In one embodiment, the present invention provides advantages over current methods used in the art. These advantages include providing access to insoluble enzymes that would normally be discarded, increasing the yield of desired proteins that retain enzyme activity, providing purified enzymes for cleaner downstream processing, and broadening organism selection (e.g., increasing the availability of organisms that may have been previously excluded from use due to a propensity to develop inclusion bodies).

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a chromatogram showing the results of IMAC purification of solubilized Cel3a. The purified solubilized Cel3a peak is indicated by the arrow.

FIG. 2 is a picture of an SDS-PAGE gel showing the proteins in different fractions of the IMAC purification. Lane 1 shows the molecular weight standards. Lane 2 shows purified Cel3a from the soluble fraction. Lane 3 shows the flow through from IMAC purification of the insoluble fraction. Lane 4 shows the purified solubilized Cel3a from the insoluble fraction.

FIG. 3 is a graph comparing the cellobiase activity of purified soluble Cel3a and purified solubilized Cel3a from the insoluble fraction.

FIG. 4 is a graph comparing the cellobiase activity of purified soluble Cel3a, the wash fraction of the insoluble fraction, and Cel3a solubilized from the insoluble fraction without purification.

DETAILED DESCRIPTION

Definitions

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains.

The terms “a” and “an” refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.

The term “aglycosylated”, as used herein, refers to a molecule, e.g., a polypeptide, that is not glycosylated (i.e., it comprises a hydroxyl group or other functional group that is not attached to a glycosylate group) at one or more sites that have a glycan attached when the molecule is produced in its native environment. In some embodiments, the aglycosylated molecule does not have any attached glycans. In one embodiment, the molecule has been altered or mutated such that the molecule cannot be glycosylated, e.g., one or more glycosylation site is mutated such that a glycan cannot be attached to the glycosylation site. In another embodiment, an attached glycan can be removed from the molecule, e.g., by an enzymatic process, e.g., by incubating with enzymes that remove glycans or have deglycosylating activity. In yet another embodiment, glycosylation of the molecule can be inhibited, e.g., by use of a glycosylation inhibitor (that inhibits a glycosylating enzyme). In another embodiment, the molecule, e.g., the polypeptide, can be produced by a host cell that does not glycosylate, e.g., E. coli. For example, a Cel3A enzyme is aglycosylated when one or more site in the protein that normally has a glycan group attached to it when the Cel3A enzyme is produced in T. reesei does not have a glycan attached at that site.

The term “biomass”, as used herein, refers to any non-fossilized, organic matter. The various types of biomass include plant biomass (e.g., lignocellulosic and cellulosic biomass), microbial biomass, animal biomass (any animal by-product, animal waste, etc.) and municipal waste biomass (residential and light commercial refuse with recyclables such as metal and glass removed). Plant biomass refers to any plant-derived organic matter (woody or non-woody). Plant biomass can include, but is not limited to, agricultural or food crops (e.g., sugarcane, sugar beets or corn kernels) or an extract therefrom (e.g., sugar from sugarcane and corn starch from corn), agricultural crop wastes and residues such as corn stover, wheat straw, rice straw, sugarcane bagasse, and the like. Plant biomass further includes, but is not limited to, trees, woody energy crops, wood wastes and residues such as softwood forest thinnings, barky wastes, sawdust, paper and pulp industry waste streams, wood fiber, and the like. Additionally, grass crops, such as switchgrass and the like, have potential to be produced on a large scale as another plant biomass source. For urban areas, the best potential plant biomass feedstock includes yard waste (e.g., grass clippings, leaves, tree clippings, and brush) and vegetable processing waste.

The term “biomass degrading enzymes”, as used herein, refers to enzymes that break down components of the biomass matter described herein into intermediates or final products. For example, biomass-degrading enzymes include at least ligninases, endoglucanases, cellobiases, xylanases, and cellobiohydrolases. Biomass-degrading enzymes are produced by a wide variety of microorganisms, and can be isolated from the microorganisms, such as T. reesei.

The term “biomass degrading activity”, as used herein, refers to enzymatic activity that breaks down components of the biomass matter described herein into intermediates or final products. Biomass-degrading activity includes at least ligninase activity, endoglucanase activity, cellobiase activity, cellobiohydrolase activity, and xylanase activity. For example, a polypeptide having biomass degrading activity is a cellobiase such as Cel3a from T. reesei.

The term “cellobiase”, as used herein, refers to an enzyme that catalyzes the hydrolysis of a dimer, trimer, tetramer, pentamer, hexamer, heptamer, octamer, or an oligomer of glucose, or an oligomer of glucose and xylose, to glucose and/or xylose. For example, the cellobiase is beta-glucosidase, which catalyzes the hydrolysis of beta-1,4 bonds in cellobiose to release two glucose molecules.

The term “cellobiase activity”, as used herein, refers to the activity of a category of cellulases that catalyze the hydrolysis of cellobiose to glucose, e.g., catalyze the hydrolysis of beta-D-glucose residues to release beta-D-glucose. Cellobiase activity can be determined according to the assays described herein, e.g., in Example 4. One unit of cellobiase activity can be defined as ([glucose] in g/L) / ([Cel3a] in g/L) per 30 minutes.
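The unit definition above, and the comparison of a solubilized preparation against the native polypeptide recited elsewhere herein, can be sketched as a simple calculation. The Python below is a minimal illustration under stated assumptions: glucose and enzyme concentrations are both in g/L, the assay runs for the 30 minutes named in the definition, and the function names and example numbers are hypothetical rather than data from this disclosure.

```python
def cellobiase_units(glucose_g_per_l: float, enzyme_g_per_l: float) -> float:
    """Activity as (g/L glucose released) per (g/L enzyme) per 30-minute assay."""
    if enzyme_g_per_l <= 0:
        raise ValueError("enzyme concentration must be positive")
    if glucose_g_per_l < 0:
        raise ValueError("glucose concentration must be non-negative")
    return glucose_g_per_l / enzyme_g_per_l


def percent_of_native(sample_units: float, native_units: float) -> float:
    """Activity of a sample expressed as a percentage of the native enzyme."""
    if native_units <= 0:
        raise ValueError("native activity must be positive")
    return 100.0 * sample_units / native_units


if __name__ == "__main__":
    # Hypothetical readings, not measured values.
    native = cellobiase_units(glucose_g_per_l=2.0, enzyme_g_per_l=0.5)
    solubilized = cellobiase_units(glucose_g_per_l=0.5, enzyme_g_per_l=0.5)
    print(native, solubilized, percent_of_native(solubilized, native))
```

Normalizing by enzyme concentration lets preparations of different protein content be compared on the same scale, which is what a percent-of-native threshold requires.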

The term “cellobiohydrolase” as used herein, refers to an enzyme that hydrolyzes glycosidic bonds in cellulose. For example, the cellobiohydrolase is 1,4-beta-D-glucan cellobiohydrolase, which catalyzes the hydrolysis of 1,4-beta-D-glucosidic linkages in cellulose, cellooligosaccharides, or any beta-1,4-linked glucose containing polymer, releasing oligosaccharides from the polymer chain.

The term “cellobiohydrolase activity”, as used herein, refers to the activity of an enzyme that catalyzes the hydrolysis of glycosidic bonds in cellulose, specifically, the hydrolysis of 1,4-beta-D-glucosidic linkages in cellulose, cellooligosaccharides, or any beta-1,4-linked glucose-containing polymer, to release cellobiose from the ends of the saccharide chain, e.g., from the reducing or the non-reducing ends of the chain. Cellobiohydrolase activity can be determined according to the assays described herein. One unit of cellobiohydrolase activity can be defined, for example, as the amount of enzyme that releases 1 μM of glucose equivalent from substrate (e.g., Avicel) per minute.

The term “endoglucanase” as used herein, refers to an enzyme that catalyzes the hydrolysis of internal beta-1,4 glycosidic bonds. For example, the endoglucanase is endo-1,4-(1,3; 1,4)-beta-D-glucan 4-glucanohydrolase, which catalyzes endohydrolysis of 1,4-beta-D-glycosidic linkages in cellulose, cellulose derivatives (such as carboxymethyl cellulose and hydroxyethyl cellulose), lichenan, beta-1,4 bonds in mixed beta-1,3 glucans such as cereal beta-D-glucans or xyloglucans, and other plant material containing cellulosic components.

The term “endoglucanase activity” as used herein, refers to the activity of an enzyme that catalyzes the endohydrolysis of the internal glycosidic bonds, e.g., internal beta-1,4 glycosidic bonds, of cellulose, cellulose derivatives (such as carboxymethyl cellulose and hydroxyethyl cellulose), lichenan, beta-1,4 bonds in mixed beta-1,3 glucans such as cereal beta-D-glucans or xyloglucans, and other plant material containing cellulosic components. Endoglucanase activity can be determined according to the assays described herein. One unit of endoglucanase activity can be defined, for example, as the amount of enzyme that increases the concentration of the reducing ends by 1 μM from substrate per minute.

The term “enzyme mixture” as used herein, refers to a combination of at least two different enzymes, or two different variants of an enzyme (e.g., a glycosylated and an aglycosylated version of an enzyme). The enzyme mixture referred to herein includes at least the aglycosylated polypeptide having cellobiase activity described herein. In one embodiment, the enzyme mixture includes one or more of a cellobiase, an endoglucanase, a cellobiohydrolase, a ligninase, and/or a xylanase. In some embodiments, the enzyme mixture includes a cell, e.g., a microorganism, which expresses and, e.g., secretes, one or more of the enzymes. For example, the enzyme mixture can include an aglycosylated polypeptide described herein and a cell, e.g., a microorganism, which expresses and, e.g., secretes, one or more additional enzymes and/or variants of the polypeptide.

The term “inclusion body” as used herein, refers to insoluble aggregates produced by a microorganism, e.g., a host cell, containing one or more of the following: a heterologously expressed polypeptide, e.g., a polypeptide having biomass-degrading activity, cellular debris, one or more ribosomal component, one or more protein endogenously expressed from the host cell, one or more nucleic acids (RNA and/or DNA), or any combination thereof. Inclusion bodies commonly occur in host cells, e.g., bacterial cells, during high levels of expression of a recombinant protein. The heterologously expressed polypeptides found in the inclusion body may be partially unfolded, partially misfolded, or partially denatured.

The term “ligninase” as used herein, refers to an enzyme that catalyzes the breakdown of lignin, commonly found in the cell walls of plants, such as by an oxidation reaction. Ligninases include lignin-modifying enzymes, lignin peroxidases and laccases.

The term “ligninase activity” as used herein, refers to the activity of an enzyme that catalyzes the breakdown of lignin and lignin-like polymers by an oxidation reaction. Ligninase activity can be determined according to the assays described herein.

The terms “nucleic acid” or “polynucleotide” are used interchangeably, and refer to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)).

The term “operably linked”, as used herein, refers to a configuration in which a control or regulatory sequence is placed at a position relative to a nucleic acid sequence that encodes a polypeptide, such that the control sequence influences the expression of a polypeptide (encoded by the DNA sequence). In an embodiment, the control or regulatory sequence is upstream of a nucleic acid sequence that encodes a polypeptide with cellobiase activity. In an embodiment, the control or regulatory sequence is downstream of a nucleic acid sequence that encodes a polypeptide with cellobiase activity.

The terms “peptide,” “polypeptide,” and “protein” are used interchangeably, and refer to a compound comprised of amino acid residues covalently linked by peptide bonds. A protein or peptide must contain at least two amino acids, and no limitation is placed on the maximum number of amino acids that can comprise a protein's or peptide's sequence. Polypeptides include any peptide or protein comprising two or more amino acids joined to each other by peptide bonds. “Polypeptides” include, for example, biologically active fragments, substantially homologous polypeptides, oligopeptides, homodimers, heterodimers, variants of polypeptides, modified polypeptides, derivatives, analogs, fusion proteins, among others. A polypeptide includes a natural peptide, a recombinant peptide, or a combination thereof. A “plurality of polypeptides” refers to two or more polypeptides, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, 100, 200, or 500 or more polypeptides.

The term “promoter”, as used herein, refers to a DNA sequence recognized by the synthetic machinery of the cell, or introduced synthetic machinery, required to initiate the specific transcription of a polynucleotide sequence.

The term “regulatory sequence” or “control sequence”, as used interchangeably herein, refers to a nucleic acid sequence which is required for expression of a nucleic acid product. In some instances, this sequence may be a promoter sequence and in other instances, this sequence may also include an enhancer sequence and other regulatory elements which are required for expression of the gene product. The regulatory/control sequence may, for example, be one which expresses the nucleic acid product in a regulated manner, e.g., inducible manner.

The term “constitutive” promoter refers to a nucleotide sequence which, when operably linked with a polynucleotide which encodes a polypeptide, causes the polypeptide to be produced in a cell under most or all physiological conditions of the cell. In an embodiment, the polypeptide is a polypeptide having cellobiase activity.

The term “inducible” promoter refers to a nucleotide sequence which, when operably linked with a polynucleotide which encodes a polypeptide, causes the polypeptide to be produced in a cell substantially only when an inducer which corresponds to the promoter is present in the cell. In an embodiment, the polypeptide is a polypeptide having cellobiase activity.

The term “repressible” promoter refers to a nucleotide sequence which, when operably linked with a polynucleotide which encodes a polypeptide, causes the polypeptide to be produced in a cell substantially only until a repressor which corresponds to the promoter is present in the cell. In an embodiment, the polypeptide is a polypeptide having cellobiase activity.

The term “solubilizing agent” refers to an agent that has the capacity for disrupting non-covalent bonds, e.g., hydrogen bonds, hydrophobic interactions, van der Waals interactions, dipole-dipole interactions, ionic interactions, pi-stacking, or any combination thereof. The disruption of the non-covalent bonds leads to the solubilization, or dissolution, of previously insoluble matter into solution. Specifically, a solubilizing agent used herein increases the ability of polypeptides having biomass-degrading activity described herein that have aggregated into inclusion bodies to dissolve into solution, e.g., water-based solution or a buffer. Examples of suitable solubilizing agents are described herein.

The term “xylanase” as used herein, refers to enzymes that hydrolyze xylan-containing material. Xylan is a polysaccharide comprising units of xylose. A xylanase can be an endoxylanase, a beta-xylosidase, an arabinofuranosidase, an alpha-glucuronidase, an acetylxylan esterase, a feruloyl esterase, or an alpha-glucuronyl esterase.

The term “xylanase activity” as used herein, refers to the activity of enzymes that catalyze the endohydrolysis of 1,4-beta-D-xylosidic linkages in xylans and xylan-like polymers. Xylanase activity can be determined according to the assays described herein. One unit of xylanase activity will release 1 μM of xylose equivalent from xylan per minute.

Description

High-level expression of recombinant proteins in host cells such as E. coli often leads to accumulation of the recombinant proteins into inactive, misfolded and insoluble aggregates within the host cell. These insoluble aggregates are called inclusion bodies and can also contain other components endogenous to the host cell, such as protein, ribosomal components, nucleic acids, and cellular debris. As much as 70-80% of proteins produced by recombinant techniques can form inclusion bodies, thereby significantly reducing the yield of active recombinant protein that can be readily isolated from the host cells.

Solubilization of the recombinant proteins from the inclusion bodies can be achieved through treatment with chaotropic agents, e.g., high concentrations of urea, which disrupt hydrogen bonds and hydrophobic interactions. However, such solubilization processes often result in denaturation of the protein and loss of native function or enzymatic activity. The soluble denatured proteins can be refolded to their native state after removal of chaotropic agents; however, refolding of recombinant proteins into bioactive forms with enzymatic activity can be cumbersome and costly, and can result in low recovery of the final product.

The present invention is based, at least in part, on the surprising discovery that a heterologously expressed cellobiase that has been solubilized from inclusion bodies by urea retains cellobiase activity. The recovery of heterologously expressed cellobiase from the inclusion bodies increased the total yield of cellobiase by 30-40%. Furthermore, the presence of the solubilizing agent, e.g., urea, from the addition of the solubilized biomass-degrading enzyme, e.g., cellobiase, does not adversely affect the saccharification reaction for converting biomass to a sugar product and/or the yield of products.

Accordingly, the present invention provides methods for solubilizing a polypeptide having biomass-degrading activity from inclusion bodies, where the resulting solubilized polypeptide retains biomass-degrading activity, whereby the additional processing steps of refolding the polypeptide and removing the solubilizing agent are not required. The present invention provides methods for increasing the recovery of heterologously-expressed biomass-degrading enzymes from inclusion bodies, while retaining enzymatic activity, and use of the recovered biomass-degrading enzymes in methods for converting a biomass into products, e.g., by saccharification.

Polypeptides Having Biomass-Degrading Activity

The present disclosure provides a polypeptide, or a plurality of polypeptides, having a biomass-degrading activity. In embodiments, the polypeptide having biomass-degrading activity, or plurality thereof, is present in a mixture with one or more solubilizing agents. Some mixtures may also contain one or more proteins associated with an inclusion body. In other embodiments, the mixture does not contain one or more proteins associated with the inclusion body, e.g., the polypeptide or plurality thereof having biomass-degrading activity was purified from one or more proteins associated with the inclusion body.

For example, the polypeptide has cellobiase activity, ligninase activity, endoglucanase activity, cellobiohydrolase activity, or xylanase activity.

In an embodiment, the polypeptide is a cellobiase. A cellobiase is an enzyme that hydrolyzes beta-1,4 bonds in its substrate, e.g., cellobiose, to release two glucose molecules. Cellobiose is a water soluble 1,4-linked dimer of glucose. In an embodiment, the polypeptide is Cel3a. Cel3a (also known as BglI) is a cellobiase that was identified in Trichoderma reesei. The amino acid sequence for Cel3a (GenBank Accession No. NW_006711153) is provided below:

(SEQ ID NO: 1) MGDSHSTSGASAEAVVPPAGTPWGTAYDKAKAALAKLNLQDKVGIVSGVGWNGGPCVGNTSPASKISYPSLCLQDGPLGVRYSTGSTAFTPGVQAASTWDVNLIRERGQFIGEEVKASGIHVILGPVAGPLGKTPQGGRNWEGFGVDPYLTGIAMGQTINGIQSVGVQATAKHYILNEQELNRETISSNPDDRTLHELYTWPFADAVQANVASVMCSYNKVNTTWACEDQYTLQTVLKDQLGFPGYVMTDWNAQHTTVQSANSGLDMSMPGTDFNGNNRLWGPALTNAVNSNQVPTSRVDDMVTRILAAWYLTGQDQAGYPSFNISRNVQGNHKTNVRAIARDGIVLLKNDANILPLKKPASIAVVGSAAIIGNHARNSPSCNDKGCDDGALGMGWGSGAVNYPYFVAPYDAINTRASSQGTQVTLSNTDNTSSGASAARGKDVAIVFITADSGEGYITVEGNAGDRNNLDPWHNGNALVQAVAGANSNVIVVVHSVGAIILEQILALPQVKAVVWAGLPSQESGNALVDVLWGDVSPSGKLVYTIAKSPNDYNTRIVSGGSDSFSEGLFIDYKHFDDANITPRYEFGYGLSYTKFNYSRLSVLSTAKSGPATGAVVPGGPSDLFQNVATVTVDIANSGQVTGAEVAQLYITYPSSAPRTPPKQLRGFAKLNLTPGQSGTATFNIRRRDLSYWDTASQKWVVPSGSFGISVGASSRDIRLTSTLSVAGSGS

In an embodiment, the polypeptide is a ligninase. A ligninase is an enzyme that breaks down lignin, which is a complex polymer of aromatic alcohols known as monolignols and forms an integral part of the secondary cell walls of plants and some algae. Ligninases include lignin peroxidases, 1,2-bis(3,4-dimethoxyphenyl)propane-1,3-diol:hydrogen-peroxide oxidoreductase, diarylpropane oxygenase, ligninase I, diarylpropane peroxidase, LiP, hydrogen-peroxide oxidoreductase (C-C-bond-cleaving), and some laccases. Examples of ligninases include CIP2 from Trichoderma reesei; LPOA, GLG2, GLG4, LIPA, GLG5, GLG3, GLG6, and LIPB from Phanerochaete chrysosporium; ligninase-3 from Phlebia radiata; Ligninase A and B from Coriolus versicolor; and LPG I and LPGIV from Coriolus versicolor.

In an embodiment, the polypeptide is an endoglucanase. An endoglucanase is an enzyme that catalyzes the hydrolysis of cellulose. Specifically, the endoglucanases cleave the internal bonds of the cellulose chain. Endoglucanases are produced by fungi, bacteria, and protozoans. Endoglucanases are also known as beta-1-4 endoglucanase, 4-beta-D-glucan cellobiohydrolase, exo-cellobiohydrolase, beta-1,4-glucan cellobiohydrolase, beta-1,4-glucan cellobiosylhydrolase, 1,4-beta-glucan cellobiosidase, exoglucanase, avicelase, CBH 1, C1 cellulase, cellobiohydrolase I, cellobiohydrolase, exo-beta-1,4-glucan cellobiohydrolase, 1,4-beta-D-glucan cellobiohydrolase, or cellobiosidase. Examples of endoglucanases include Cel5A, Cel5B, Cel7B, Cel12A, Cel45A, Cel61A, Cel61B, and Cel74A from Trichoderma reesei.

In an embodiment, the polypeptide is a cellobiohydrolase, also known as exoglucanase. A cellobiohydrolase catalyzes the hydrolysis of 1-4-beta-D-glucosidic linkages in oligosaccharides containing that linkage, e.g., cellulose and cellotetraose, thereby releasing cellobiose from the non-reducing ends of the chains. Examples of cellobiohydrolases include cellobiohydrolase I (CBHI) and cellobiohydrolase II (CBHII) from Trichoderma reesei.

In an embodiment, the polypeptide is a xylanase. Xylanases are also known as endo-(1-4)-beta-xylan 4-xylanohydrolase, endo-1,4-xylanase, endo-1,4-beta-xylanase, beta-1,4-xylanase, endo-1,4-beta-D-xylanase, 1,4-beta-xylan xylanohydrolase, beta-xylanase, beta-1,4-xylan xylanohydrolase, and beta-D-xylanase. A xylanase breaks down a component of plant cell walls called hemicellulose, e.g., degrades polysaccharides, such as xylan, e.g., beta-1,4-xylan, glucuronoxylan, arabinoxylan, glucomannan, and xyloglucan, to release xylose. Examples of xylanases include Xyn1, Xyn2, and Xyn3 from Trichoderma reesei; and TERTU_1599, TERTU_3603, TERTU_2546, and TERTU_4506 from Teredinibacter turnerae T7901.

The present disclosure also provides functional variants of a polypeptide having biomass-degrading activity described herein. In an embodiment, a functional variant has an amino acid sequence with at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to a biomass-degrading enzyme described herein, or a functional fragment thereof, e.g., at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to a biomass-degrading enzyme described herein, or a functional fragment thereof.

In an embodiment, a functional variant has an amino acid sequence with at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to Cel3a produced by T. reesei or SEQ ID NO: 1, or a functional fragment thereof.

Percent identity, in the context of two or more amino acid or nucleic acid sequences, refers to two or more sequences that are the same. Two sequences are “substantially identical” if the two sequences have a specified percentage of amino acid residues or nucleotides that are the same (e.g., 60% identity, optionally 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity over a specified region, or, when not specified, over the entire sequence), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Optionally, the identity exists over a region that is at least about 50 nucleotides, 100 nucleotides, or 150 nucleotides in length. More preferably, the identity exists over a region that is at least about 200 or more amino acids, or at least about 500 or 1000 or more nucleotides, in length.

For sequence comparison, one sequence typically acts as a reference sequence, to which one or more test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters. Methods of alignment of sequences for comparison are well known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman, (1981) Adv. Appl. Math. 2:482c, by the homology alignment algorithm of Needleman and Wunsch, (1970) J. Mol. Biol. 48:443, by the search for similarity method of Pearson and Lipman, (1988) Proc. Nat'l. Acad. Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Brent et al., (2003) Current Protocols in Molecular Biology).

Two examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., (1990) J. Mol. Biol. 215:403-410; and Altschul et al., (1997) Nuc. Acids Res. 25:3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information.
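As a rough illustration of the percent identity calculation described above, the following Python sketch counts identical columns in two sequences that have already been aligned (for example, by one of the algorithms named above). The function names, the use of "-" as the gap character, and the choice of denominator (all alignment columns that are not gap-versus-gap) are assumptions made for illustration; alignment programs differ in how they define the denominator, and this is not the method of any particular program.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two sequences that are ALREADY aligned.

    Assumption: the denominator is the number of alignment columns in which
    at least one sequence has a residue (gap-versus-gap columns are skipped).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep every column except those where both rows are gaps.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    # A column counts as identical only if both rows hold the same residue.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 60.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

A candidate variant could be screened against a chosen cutoff, e.g., 80%, by passing the aligned variant and reference sequences to meets_identity_threshold with threshold=80.0.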

Functional variants may comprise one or more mutations, such that the variant retains biomass-degrading activity that is better than the biomass-degrading activity of a biomass-degrading enzyme described herein produced by the microorganism from which the enzyme originates. In an embodiment, the functional variant has at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% (e.g., at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) of the biomass-degrading activity of a biomass-degrading enzyme as produced by E. coli. In embodiments, the functional variant has at least 200%, at least 300%, at least 400%, at least 500%, at least 1000% or more of the biomass-degrading activity of a biomass-degrading enzyme produced by E. coli or the microorganism from which the enzyme originates. Biomass-degrading activity can be tested using the functional assays described herein. In one embodiment, the functional variant retains cellobiase activity that is better than the cellobiase activity of Cel3a as produced by T. reesei. In another embodiment, the functional variant has at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% (e.g., at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) of the cellobiase activity of a Cel3a or enzyme comprising SEQ ID NO: 1 as produced by E. coli. In embodiments, the functional variant has increased biomass-degrading activity compared to a biomass-degrading enzyme described herein, e.g., at least 200%, at least 300%, at least 400%, at least 500%, at least 1000% or more of the biomass-degrading activity of a biomass-degrading enzyme described herein, e.g., cellobiase activity of a Cel3a or enzyme comprising SEQ ID NO: 1 produced by E. coli or the microorganism from which the enzyme originates.

The mutations present in a functional variant include amino acid substitutions, additions, and deletions. Mutations can be introduced by standard techniques known in the art, such as site-directed mutagenesis and PCR-mediated mutagenesis. The mutation may be a conservative amino acid substitution, in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine, tryptophan), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, one or more amino acid residues within the polypeptide having cellobiase activity of the disclosure can be replaced with other amino acids from the same side chain family, and the resultant polypeptide retains cellobiase activity comparable (e.g., at least 80%, 85%, 90%, 95%, or 99% of the cellobiase activity) to that of the wild-type polypeptide. Alternatively, the mutation may be an amino acid substitution in which an amino acid residue is replaced with an amino acid residue having a different side chain.
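The side-chain families listed above can be written down as a small lookup table. The following Python sketch, offered only as an illustration, encodes the families exactly as enumerated in the preceding paragraph (using one-letter amino acid codes) and reports whether a substitution stays within at least one shared family. The names SIDE_CHAIN_FAMILIES, shared_families, and is_conservative are hypothetical, and treating membership in any one shared family as sufficient is an assumption.

```python
# Families as enumerated in the text, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYCW"),
    "nonpolar": set("AVLIPFM"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(res_a: str, res_b: str) -> list:
    """Names of the families that contain both residues, sorted by name."""
    a, b = res_a.upper(), res_b.upper()
    return sorted(name for name, members in SIDE_CHAIN_FAMILIES.items()
                  if a in members and b in members)


def is_conservative(res_a: str, res_b: str) -> bool:
    """True if the two residues differ and share at least one family.

    Assumption: one shared family is enough to call the change conservative.
    """
    return res_a.upper() != res_b.upper() and bool(shared_families(res_a, res_b))
```

Under this grouping, a lysine-to-arginine change is conservative, whereas a serine-to-alanine change crosses from the uncharged polar family to the nonpolar family and is therefore reported as non-conservative.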

Such mutations may alter or affect various enzymatic characteristics of the biomass-degrading enzyme, e.g., cellobiase, ligninase, endoglucanase, or cellobiohydrolase. For example, such mutations may alter or affect the biomass-degrading activity, thermostability, optimal pH for reaction, enzyme kinetics, or substrate recognition of the biomass-degrading enzyme. In some embodiments, a mutation increases the biomass-degrading activity of the variant in comparison to the biomass-degrading enzyme, e.g., cellobiase produced by T. reesei and/or SEQ ID NO: 1 produced in E. coli. In some embodiments, a mutation increases or decreases the thermostability of the variant in comparison to a wild-type biomass-degrading enzyme, e.g., a cellobiase and/or SEQ ID NO: 1 produced in E. coli. In an embodiment, a mutation changes the pH range at which the variant optimally performs the biomass-degrading reaction in comparison to wild-type biomass-degrading enzyme, e.g., wild-type cellobiase and/or SEQ ID NO: 1 produced in E. coli. In an embodiment, a mutation increases or decreases the kinetics of the biomass-degrading reaction (e.g., k_(cat), K_(M) or K_(D)) in comparison to wild-type biomass-degrading enzyme, e.g., wild-type cellobiase and/or SEQ ID NO: 1 produced in E. coli. In an embodiment, a mutation increases or decreases the ability of the cellobiase to recognize or bind to the substrate (e.g., cellobiose) in comparison to wild-type cellobiase and/or SEQ ID NO: 1 produced in E. coli.

The present invention also provides functional fragments of a polypeptide having biomass-degrading activity, e.g., cellobiase activity, as described herein, e.g., Cel3a or SEQ ID NO: 1. One of ordinary skill in the art could readily envision that a fragment of a polypeptide having biomass-degrading activity as described herein that contains the functional domains responsible for enzymatic activity would retain functional activity, e.g., biomass-degrading activity, and therefore, such fragments are encompassed in the present invention. In an embodiment, the functional fragment is at least 700 amino acids, at least 650 amino acids, at least 600 amino acids, at least 550 amino acids, at least 500 amino acids, at least 450 amino acids, at least 400 amino acids, at least 350 amino acids, at least 300 amino acids, at least 250 amino acids, at least 200 amino acids, at least 150 amino acids, at least 100 amino acids, or at least 50 amino acids in length. In an embodiment, the functional fragment is 700 to 744 amino acids, 650 to 699 amino acids, 600 to 649 amino acids, 550 to 599 amino acids, 500 to 549 amino acids, 450 to 499 amino acids, 400 to 449 amino acids, 350 to 399 amino acids, 300 to 349 amino acids, 250 to 299 amino acids, 200 to 249 amino acids, 150 to 199 amino acids, 100 to 149 amino acids, or 50 to 99 amino acids. With regard to the ranges of amino acid length described above, the lowest and highest values of amino acid length are included within each disclosed range. In an embodiment, the functional fragment has at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% of the biomass-degrading activity of a wild-type biomass-degrading enzyme described herein, or the biomass-degrading enzyme produced in E. coli. In an embodiment, the functional fragment has at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% of the cellobiase activity of wild-type Cel3a or the polypeptide comprising SEQ ID NO: 1 produced in E. coli.

Assays for detecting cellobiase activity are known in the art. For example, the amount of glucose released from cellobiose can be determined by incubating purified cellobiase with substrate, e.g., cellobiose or D-(+)-cellobiose, and detecting the resultant amount of free glucose after completion of the reaction. The amount of free glucose can be determined using a variety of methods known in the art. For example, dilutions of purified cellobiase are prepared in a buffer containing 50 mM sodium citrate, pH 5.0 NaOH. The cellobiose substrate is added to the purified cellobiase in an amount such that the final concentration of cellobiose in the reaction mixture is 30 mM. The reaction mixture is incubated under conditions suitable for the reaction to occur, e.g., in a shaker (700 rpm) at 48° C. for 30 minutes. To stop the reaction, the reaction mixture is heated for 5 minutes at 100° C. The reaction mixture is filtered through a 0.45 μm filter and the filtrate is analyzed to quantify the amount of glucose and/or cellobiose. A YSI instrument that measures analytes such as glucose can be used to determine the concentration of glucose produced from the reaction. Alternatively, UPLC (Ultra Performance Liquid Chromatography) can be used to determine the concentration of glucose and cellobiose from the reaction. This assay can be formatted in a single reaction or in multiple reaction formats, e.g., 96 well format. In some embodiments, the multiple reaction format may be preferred to generate an activity curve representing cellobiase activity with respect to different concentrations of the purified cellobiase. The concentration of the purified cellobiase can be determined using a standard Bradford assay. Dilutions of the purified cellobiase are prepared, e.g., 2-fold dilutions, and are aliquoted into a 96 well plate, e.g., 12 wells of 2-fold dilutions. Cellobiose substrate is added as previously described, such that the final concentration of cellobiose in the reaction is 30 mM. The plate is sealed and treated under conditions sufficient for the cellobiase reaction to occur, and then under conditions to stop the reaction. The reaction is then filtered through a 96 well format 0.45 μm membrane (e.g., Durapore) and analyzed by YSI and/or HPLC methods, e.g., UPLC.
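The 2-fold dilution series across 12 wells described above can be tabulated with a few lines of Python. This is a minimal sketch: the function name serial_dilutions is hypothetical, and it assumes that the first well holds the undiluted stock and that every later well is diluted by the same fold factor relative to the well before it.

```python
def serial_dilutions(stock_concentration: float, fold: float = 2.0,
                     wells: int = 12) -> list:
    """Enzyme concentration in each well of a serial dilution series.

    Assumption: well 1 holds undiluted stock; each later well is diluted
    `fold`-times relative to the previous well.
    """
    if stock_concentration <= 0:
        raise ValueError("stock concentration must be positive")
    if fold <= 1:
        raise ValueError("fold must be greater than 1")
    if wells < 1:
        raise ValueError("at least one well is required")
    return [stock_concentration / fold ** i for i in range(wells)]
```

With the default settings, the last of the 12 wells holds the stock diluted 2048-fold, which gives a sense of the concentration span covered by one row of a 96 well plate.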

This activity assay can also be used to determine the concentration, or titer, of a cellobiase in a sample with unknown concentration by generating a standard curve of activity of known concentrations of the cellobiase to extrapolate the concentration for the unknown concentration sample. For example, two-fold serial dilutions of a known concentration of the cellobiase are prepared in one row of a 96 well plate, e.g., 12 two-fold serial dilutions. The other rows contain two-fold serial dilutions of other remaining samples whose titer is to be determined, e.g., the crude lysate sample or solubilized inclusion body sample. The dilutions are incubated with a D-(+)-Cellobiose (Fluka) substrate solution in 50 mM sodium citrate monobasic buffer at pH 5.0, at 48° C. for 30 minutes. After 30 minutes, the samples are heated to 100° C. for 10 minutes to stop the reaction. Samples are analyzed for glucose and cellobiose using the YSI Biochemistry analyser (YSI Life Sciences) and/or HPLC methods. Using the samples of known concentration, a standard curve is generated using the data points within the linear range of the assay. The cellobiase activity detected from the samples with unknown titer can be compared to the standard curve to determine the titer of cellobiase in these samples.
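The standard-curve approach described above can be sketched as an ordinary least-squares line fitted to the known-concentration wells, followed by reading an unknown sample off that line. The Python below is illustrative only: the function names fit_line and estimate_titer are hypothetical, the choice of a straight-line fit is an assumption consistent with using only data points in the linear range, and the caller is assumed to have already discarded points outside that range.

```python
def fit_line(xs: list, ys: list) -> tuple:
    """Least-squares slope and intercept for y = slope * x + intercept.

    xs: known cellobiase concentrations (e.g., g/L) in the standard wells.
    ys: glucose measured in those wells (e.g., g/L), linear range only.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired data points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all standard concentrations are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_titer(glucose_reading: float, slope: float, intercept: float,
                   dilution_factor: float = 1.0) -> float:
    """Cellobiase concentration of an unknown sample from the standard curve.

    dilution_factor scales the result back to the undiluted sample.
    """
    if slope == 0:
        raise ValueError("standard curve has zero slope")
    return (glucose_reading - intercept) / slope * dilution_factor
```

For a crude lysate or solubilized inclusion body sample measured at a 4-fold dilution, the glucose reading from that well would be passed to estimate_titer with dilution_factor=4.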

Units of activity are only relative if calculated using values within the linear range of the assay. The linear range of the assay is defined as using glucose values that are less than 30% of the original soluble substrate load. In addition, glucose values lower than 0.05 g/L are omitted due to instrumentation reporting levels. One unit of cellobiase activity is defined as the amount of glucose per the amount of Cel3a per 30 minutes: [Glucose] g/L / [Cel3a] g/L / 30 min.
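The unit definition and the two linear-range criteria stated above can be combined into a short calculation. The following Python sketch is one possible reading of that paragraph, not a definitive implementation: the function names are hypothetical, the substrate load is assumed to be expressed in g/L of cellobiose, and the molar mass used in the example conversion (342.30 g/mol for cellobiose) is a value supplied here rather than taken from the text.

```python
CELLOBIOSE_MOLAR_MASS = 342.30  # g/mol; assumed value, not stated in the text


def substrate_load_g_per_l(cellobiose_mM: float) -> float:
    """Convert a cellobiose concentration in mM to g/L."""
    return cellobiose_mM / 1000.0 * CELLOBIOSE_MOLAR_MASS


def in_linear_range(glucose_g_per_l: float, substrate_g_per_l: float,
                    max_fraction: float = 0.30,
                    reporting_floor: float = 0.05) -> bool:
    """Glucose must be >= 0.05 g/L and < 30% of the original substrate load."""
    return reporting_floor <= glucose_g_per_l < max_fraction * substrate_g_per_l


def cellobiase_units(glucose_g_per_l: float, cel3a_g_per_l: float,
                     substrate_g_per_l: float):
    """[Glucose] g/L divided by [Cel3a] g/L for a 30-minute reaction.

    Returns None when the glucose value falls outside the linear range.
    """
    if cel3a_g_per_l <= 0:
        raise ValueError("Cel3a concentration must be positive")
    if not in_linear_range(glucose_g_per_l, substrate_g_per_l):
        return None
    return glucose_g_per_l / cel3a_g_per_l
```

For the 30 mM cellobiose reactions described earlier, substrate_load_g_per_l(30) gives roughly 10.3 g/L, so glucose readings at or above about 3.1 g/L would be excluded from the unit calculation.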

In other embodiments, a colorimetric/fluorometric assay can be used. The purified cellobiase is incubated with substrate cellobiose under conditions for the reaction to occur. Detection of the product glucose is as follows. Glucose oxidase is added to the mixture, which oxidizes glucose (the product) to gluconic acid and hydrogen peroxide. Peroxidase and o-dianisidine are then added. O-dianisidine reacts with the hydrogen peroxide in the presence of peroxidase to form a colored product. Sulfuric acid is added, which reacts with the oxidized o-dianisidine to form a more stable colored product. The intensity of the color when measured, e.g., by spectrophotometer or colorimeter, e.g., at 540 nm, is directly proportional to the glucose concentration. Such colorimetric/fluorometric glucose assays are commercially available, for example from Sigma Aldrich, Catalog No. GAGO-20.

Assays for detecting ligninase activity are known in the art. Ligninase activity can be measured by determining the rate of oxidation of veratryl alcohol to veratrylaldehyde (abbreviated as VAO for veratryl alcohol oxidation). Reaction mixtures are prepared, and contain dilutions of enzyme, 2 mM veratryl alcohol, 0.4 mM H₂O₂ and either 20 or 100 mM sodium tartrate, pH 2.9 in a final volume of 0.5 ml. The reactions are started by H₂O₂ addition and are monitored by spectrophotometry at 310 nm. Protein is determined according to Bradford, M. M., (1976) Anal. Biochem. 72:248-254, using bovine serum albumin (Sigma Chemical Co., St. Louis, Mo.) as standard or by using the 409 nm absorbance of a protein solution and calculating protein amount from the extinction coefficient of ligninase.

Assays for detecting endoglucanase activity are known in the art. For example, endoglucanase activity can be determined by measuring the hydrolysis of substrate carboxymethyl cellulose (CMC) and quantifying the concentration of reducing ends by the BCA method, in which the total concentration of reducing ends is exhibited by a color change of the sample solution in proportion to the concentration of the reducing ends. First, the polypeptide having biomass-degrading activity is diluted in a 50 mM citrate buffer at pH 4.8. CMC solution (0.05% w/v CMC in the sodium citrate buffer) is added to a reaction tube and equilibrated at 50° C. The diluted enzyme samples are added to the reaction and incubated at 50° C. for 10 minutes. BCA reagents are added and incubated at 75° C. for 30 minutes. The absorbance is read at 560 nm after subtracting the readings for the enzyme blanks and the substrate blank. Enzyme activity can be calculated based on a linear range between reducing end concentration and enzyme concentrations. Other endoglucanase activity assays are known in the art, for example, by determining a reduction in substrate viscosity (Zhang et al., Biotechnol Adv, 2006, 24:452-481).

Assays for detecting cellobiohydrolase activity are known in the art. Cellobiohydrolase activity can be determined by measuring soluble substrate released from substrate Avicel in a phenol-sulfuric assay. An Avicel solution (1.25% w/v in acetate buffer) is aliquoted into reaction tubes, and dilutions of the enzyme are prepared. Both substrate and enzyme solutions are equilibrated at 50° C. The diluted enzyme solutions are added to the substrate and incubated for a time sufficient for the reaction to occur, e.g., at 50° C. for 2 hours. The reactions are stopped by submerging the samples into an ice cold water bath. The samples are centrifuged to separate the samples into a soluble and insoluble fraction. The total concentration of soluble sugars in the soluble fraction is determined by phenol-sulfuric assay. Specifically, an aliquot of the soluble fraction is mixed with 5% phenol, and concentrated sulfuric acid is added. The reaction is cooled to room temperature (about 20-30 minutes), and absorbance of the samples is read at 490 nm. The enzyme activity is calculated on the basis of a linear relationship between total soluble sugar release and the enzyme dilution. Other cellobiohydrolase activity assays are described in Zhang et al., Biofuels: Methods and Protocols, Vol. 581, pages 213-231.

Assays for detecting xylanase activity are known in the art. Xylanase activity can be determined by measuring the level of xylose released from a xylan substrate by a colorimetric assay. Xylan substrate is prepared as a 1.0% w/v solution in 50 mM sodium acetate buffer, pH 4.5. Dilutions of the enzyme are prepared. Xylan and the enzyme dilutions are mixed, and incubated under conditions sufficient for the reaction to occur, e.g., 30° C. for 10 minutes. Then a solution containing 16 mM copper sulfate, 1.3 M sodium sulfate, 226 mM sodium carbonate, 190 mM sodium bicarbonate, and 43 mM sodium potassium tartrate is added to the reaction. The reaction is then boiled for 10 minutes, and allowed to cool to room temperature. A solution containing 40 mM molybdic acid, 19 mM arsenic acid, and 756 mM sulfuric acid is added. The reaction is shaken or vortexed until the foaming stops and any precipitate present is dissolved. The reaction is centrifuged to clarify, then the solutions are read by spectrophotometer at 540 nm, and enzyme activity is calculated on the basis of a linear relationship between total soluble sugar release and the enzyme dilution.

Aglycosylated Polypeptides

Any of the polypeptides having biomass-degrading activity described herein, e.g., cellobiase activity, can be glycosylated or aglycosylated. An aglycosylated polypeptide having biomass-degrading activity may be solubilized from an inclusion body, as described herein. Alternatively, an aglycosylated polypeptide having biomass-degrading activity may be added to a mixture comprising a polypeptide having biomass-degrading activity that has been solubilized from an inclusion body, in which the polypeptide that was solubilized from an inclusion body can be glycosylated or aglycosylated.

Glycosylation is the enzymatic process by which a carbohydrate is attached to a glycosyl acceptor, e.g., the nitrogen of arginine or asparagine side chains or the hydroxyl oxygen of serine, threonine, or tyrosine side chains. There are two types of glycosylation: N-linked and O-linked glycosylation. N-linked glycosylation occurs at consensus site Asn-X-Ser/Thr, wherein the X can be any amino acid except a proline. O-linked glycosylation occurs at Ser/Thr residues. Glycosylation sites can be predicted using various algorithms known in the art, such as Prosite, publicly available from the Swiss Institute of Bioinformatics, and NetNGlyc 1.0 or NetOGlyc 4.0, publicly available from the Center for Biological Sequence Analysis.
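The N-linked consensus site Asn-X-Ser/Thr (X being any amino acid except proline) can be located with a simple pattern search. The Python sketch below is only a motif scan over a one-letter amino acid sequence; it is not a reimplementation of Prosite, NetNGlyc, or NetOGlyc, and the presence of the motif does not establish that a site is actually glycosylated. The names SEQUON and find_n_glyc_sequons are hypothetical.

```python
import re

# N, followed by any residue except P, followed by S or T.
# The lookahead lets overlapping motifs (e.g., "NNTS") each be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glyc_sequons(sequence: str) -> list:
    """Return (1-based position of Asn, three-residue motif) for each match."""
    seq = sequence.upper()
    return [(m.start() + 1, seq[m.start():m.start() + 3])
            for m in SEQUON.finditer(seq)]
```

Positions are reported with 1-based numbering so that they can be compared directly with amino acid positions as they are conventionally written.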

The present invention provides methods for producing an aglycosylated polypeptide having biomass-degrading activity. In one embodiment, the nucleic acid encoding the polypeptide has been altered or mutated such that the polypeptide cannot be glycosylated, e.g., one or more glycosylation sites are mutated such that a glycan cannot be attached to the glycosylation site. For example, an aglycosylated polypeptide having biomass-degrading activity encoded by a nucleic acid sequence described herein contains one or more mutations at one or more glycosylation sites, such that a glycan can no longer be attached or linked to the glycosylation site. In another example, the polypeptide having biomass-degrading activity encoded by a nucleic acid sequence described herein contains one or more mutations proximal to one or more glycosylation sites, such that a glycan can no longer be attached or linked to the glycosylation site. For example, the mutation proximal to a glycosylation site mutates the consensus motif recognized by the glycosylating enzyme, or changes the conformation of the polypeptide such that the polypeptide cannot be glycosylated, e.g., the glycosylation site is hidden or steric hindrance due to the new conformation prevents the glycosylating enzymes from accessing the glycosylation site. A mutation proximal to a glycosylation site in the polypeptide having biomass-degrading activity is directly adjacent to, or at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 30 or at least 40 amino acids from the glycosylation site that, as a result of the proximal mutation, will not be glycosylated.

In an embodiment, one or more of the following glycosylation sites of a cellobiase, e.g., Cel3a, or SEQ ID NO: 1, are mutated: the threonine at amino acid position 78, the threonine at amino acid position 241, the serine at amino acid position 343, the serine at amino acid position 450, the threonine at amino acid position 599, the serine at amino acid position 616, the threonine at amino acid position 691, the serine at amino acid position 21, the threonine at amino acid position 24, the serine at amino acid position 25, the serine at amino acid position 28, the threonine at amino acid position 38, the threonine at amino acid position 42, the threonine at amino acid position 303, the serine at amino acid position 398, the serine at amino acid position 435, the serine at amino acid position 436, the threonine at amino acid position 439, the threonine at amino acid position 442, the threonine at amino acid position 446, the serine at amino acid position 451, the serine at amino acid position 619, the serine at amino acid position 622, the threonine at amino acid position 623, the serine at amino acid position 626, or the threonine at amino acid position 630, or any combination thereof. In embodiments, the glycosylation site is mutated from a serine or threonine to an alanine. For example, the aglycosylated polypeptide described herein has one or more of the following mutations: T78A, T241A, S343A, S450A, T599A, S616A, T691A, S21A, T24A, S25A, S28A, T38A, T42A, T303A, T398A, S435A, S436A, T439A, T442A, T446A, S451A, S619A, S622A, T623A, S626A, or T630A, or any combination thereof. Alternatively, one or more amino acids proximal to the glycosylation sites described above are mutated.
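
The substitution labels listed above follow the usual pattern of wild-type residue, 1-based position, and replacement residue (e.g., T78A). A minimal Python sketch of applying such labels to a protein sequence is given below. The function name and the toy sequence are hypothetical, and the sketch assumes positions are counted from the first residue of the sequence supplied; it checks that the stated wild-type residue is actually present before substituting.

```python
import re

# Label format assumed here: <wild-type residue><1-based position><replacement>.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(protein, mutations):
    """Return a copy of `protein` with each point substitution applied."""
    residues = list(protein)
    for label in mutations:
        match = MUTATION.match(label)
        if match is None:
            raise ValueError(f"unrecognized mutation label: {label}")
        wild_type, replacement = match.group(1), match.group(3)
        position = int(match.group(2))
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{label}: found {residues[position - 1]} at position {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Toy sequence (hypothetical; not SEQ ID NO: 1).
print(apply_substitutions("MSTKSA", ["S2A", "T3A"]))
```

The wild-type check is a design choice: it surfaces numbering mismatches (for example, a sequence with or without a signal peptide) instead of silently mutating the wrong residue.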

Assays to detect whether a polypeptide is modified by a glycan (e.g., whether the polypeptide is glycosylated or aglycosylated) are known in the art. The polypeptide can be purified or isolated and can be stained for detection and quantification of glycan moieties, or the polypeptide can be analyzed by mass spectrometry, and compared to a corresponding reference polypeptide. The reference polypeptide has the same primary sequence as the test polypeptide (of which the glycosylation state is to be determined), but is either glycosylated or aglycosylated.

The aglycosylated polypeptides described herein may have increased biomass-degrading activity, e.g., cellobiase activity, compared to a corresponding glycosylated polypeptide, e.g., glycosylated Cel3a polypeptide. For example, the aglycosylated polypeptide having biomass-degrading activity, e.g., cellobiase activity, has at least 1%, 2%, 5%, 10%, 15%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, or 200% biomass-degrading activity, e.g., cellobiase activity, compared to the glycosylated polypeptide.

Nucleic Acids, Expression Vectors and Host Cells

A polypeptide having biomass-degrading activity as described herein is expressed in host cells. In one embodiment, an expression vector comprising a nucleic acid sequence encoding any of the polypeptides described herein having biomass-degrading activity is introduced into a host cell, and the host cell is cultured under conditions appropriate for expression of the polypeptide having biomass-degrading activity. In embodiments, the expression of the polypeptide having biomass-degrading activity is at a level such that inclusion bodies, or aggregates comprising the polypeptide having biomass-degrading activity, are formed. Also described herein are methods for expressing and isolating the soluble polypeptide having biomass-degrading activity expressed in the host cell. Methods for solubilizing and isolating the polypeptide having biomass-degrading activity from the inclusion bodies are described further in the section titled “Solubilization from Inclusion Bodies”.

The present invention also provides a nucleic acid sequence encoding a polypeptide having biomass-degrading activity. In an embodiment, the nucleic acid sequence encodes a ligninase, an endoglucanase, a cellobiohydrolase, a xylanase, or a cellobiase described herein. The nucleic acid sequence encoding a polypeptide having biomass-degrading activity can be codon-optimized for increased expression in host cells. Codon optimization includes changing the nucleic acid sequence to take into consideration factors including codon usage bias, cryptic splicing sites, mRNA secondary structure, premature polyA sites, interaction of codon and anti-codon, and RNA instability motifs, to increase expression of the encoded polypeptide in the host. Various algorithms and commercial services for codon-optimization are known and available in the art.

In an embodiment, the nucleic acid sequence encodes a Cel3a enzyme from T. reesei with the amino acid sequence described herein, e.g., SEQ ID NO: 1. In an embodiment, the nucleic acid sequence that encodes Cel3a is provided below:

(SEQ ID NO: 2) ATGCGTTACCGAACAGCAGCTGCGCTGGCACTTGCCACTGGGCCCTTTGCTAGGGCAGACAGTCACTCAACATCGGGGGCCTCGGCTGAGGCAGTTGTACCTCCTGCAGGGACTCCATGGGGAACCGCGTACGACAAGGCGAAGGCCGCATTGGCAAAGCTCAATCTCCAAGATAAGGTCGGCATCGTGAGCGGTGTCGGCTGGAACGGCGGTCCTTGCGTTGGAAACACATCTCCGGCCTCCAAGATCAGCTATCCATCGCTATGCCTTCAAGACGGACCCCTCGGTGTTCGATACTCGACAGGCAGCACAGCCTTTACGCCGGGCGTTCAAGCGGCCTCGACGTGGGATGTCAATTTGATCCGCGAACGTGGACAGTTCATCGGTGAGGAGGTGAAGGCCTCGGGGATTCATGTCATACTTGGTCCTGTGGCTGGGCCGCTGGGAAAGACTCCGCAGGGCGGTCGCAACTGGGAGGGCTTCGGTGTCGATCCATATCTCACGGGCATTGCCATGGGTCAAACCATCAACGGCATCCAGTCGGTAGGCGTGCAGGCGACAGCGAAGCACTATATCCTCAACGAGCAGGAGCTCAATCGAGAAACCATTTCGAGCAACCCAGATGACCGAACTCTCCATGAGCTGTATACTTGGCCATTTGCCGACGCGGTTCAGGCCAATGTCGCTTCTGTCATGTGCTCGTACAACAAGGTCAATACCACCTGGGCCTGCGAGGATCAGTACACGCTGCAGACTGTGCTGAAAGACCAGCTGGGGTTCCCAGGCTATGTCATGACGGACTGGAACGCACAGCACACGACTGTCCAAAGCGCGAATTCTGGGCTTGACATGTCAATGCCTGGCACAGACTTCAACGGTAACAATCGGCTCTGGGGTCCAGCTCTCACCAATGCGGTAAATAGCAATCAGGTCCCCACGAGCAGAGTCGACGATATGGTGACTCGTATCCTCGCCGCATGGTACTTGACAGGCCAGGACCAGGCAGGCTATCCGTCGTTCAACATCAGCAGAAATGTTCAAGGAAACCACAAGACCAATGTCAGGGCAATTGCCAGGGACGGCATCGTTCTGCTCAAGAATGACGCCAACATCCTGCCGCTCAAGAAGCCCGCTAGCATTGCCGTCGTTGGATCTGCCGCAATCATTGGTAACCACGCCAGAAACTCGCCCTCGTGCAACGACAAAGGCTGCGACGACGGGGCCTTGGGCATGGGTTGGGGTTCCGGCGCCGTCAACTATCCGTACTTCGTCGCGCCCTACGATGCCATCAATACCAGAGCGTCTTCGCAGGGCACCCAGGTTACCTTGAGCAACACCGACAACACGTCCTCAGGCGCATCTGCAGCAAGAGGAAAGGACGTCGCCATCGTCTTCATCACCGCCGACTCGGGTGAAGGCTACATCACCGTGGAGGGCAACGCGGGCGATCGCAACAACCTGGATCCGTGGCACAACGGCAATGCCCTGGTCCAGGCGGTGGCCGGTGCCAACAGCAACGTCATTGTTGTTGTCCACTCCGTTGGCGCCATCATTCTGGAGCAGATTCTTGCTCTTCCGCAGGTCAAGGCCGTTGTCTGGGCGGGTCTTCCTTCTCAGGAGAGCGGCAATGCGCTCGTCGACGTGCTGTGGGGAGATGTCAGCCCTTCTGGCAAGCTGGTGTACACCATTGCGAAGAGCCCCAATGACTATAACACTCGCATCGTTTCCGGCGGCAGTGACAGCTTCAGCGAGGGACTGTTCATCGACTATAAGCACTTCGACGACGCCAATATCACGCCGCGGTACGAGTTCGGCTATGGACTGTCTTACACCAAGTTCAACTACTCACGCCTCTCCGTCTTGTCGACCGCCAAGTCTGGTCCTGCGACTGGGGCCGTTGTGCCGGGAGGCCCGAGTGATCTGTTCCAGAATGTCGCGACAGTCACCGTTGACATCGCAAACTCTGGCCAAGTGACTGGTGCCGA
GGTAGCCCAGCTGTACATCACCTACCCATCTTCAGCACCCAGGACCCCTCCGAAGCAGCTGCGAGGCTTTGCCAAGCTGAACCTCACGCCTGGTCAGAGCGGAACAGCAACGTTCAACATCCGACGACGAGATCTCAGCTACTGGGACACGGCTTCGCAGAAATGGGTGGTGCCGTCGGGGTCGTTTGGCATCAGCGTGGGAGCGAGCAGCCGGGATATCAGGCTGACGAGCACTCTGTCGGTAGCG

The codon-optimized nucleic acid sequence that encodes Cel3a is provided below:

(SEQ ID NO: 3) ATGCGTTATCGTACAGCCGCAGCCCTGGCACTGGCCACAGGTCCGTTCGCACGTGCCGATAGTCACAGTACCAGCGGTGCCAGCGCAGAAGCCGTGGTTCCGCCGGCAGGCACACCGTGGGGCACAGCCTATGATAAAGCCAAAGCCGCCCTGGCCAAGCTGAATCTGCAGGATAAAGTGGGCATCGTGAGTGGCGTGGGCTGGAACGGTGGTCCGTGCGTTGGCAACACCAGCCCGGCAAGCAAGATCAGCTATCCGAGCTTATGCCTGCAGGATGGTCCGCTGGGCGTGCGCTATAGCACCGGTAGTACCGCCTTTACACCTGGTGTGCAGGCCGCCAGTACCTGGGACGTTAACCTGATCCGCGAACGTGGCCAATTTATCGGCGAAGAAGTTAAAGCCAGCGGCATTCATGTTATTCTGGGTCCGGTGGCCGGTCCTCTGGGTAAAACCCCGCAGGGCGGCCGTAATTGGGAAGGCTTCGGCGTTGATCCGTATTTAACCGGCATCGCAATGGGCCAGACCATTAATGGCATCCAGAGCGTGGGTGTTCAAGCCACCGCCAAACACTACATATTAAACGAACAGGAACTGAATCGTGAAACCATCAGCAGCAATCCGGATGATCGCACCCTGCATGAGCTGTATACATGGCCTTTTGCCGACGCAGTTCAGGCCAACGTGGCAAGTGTGATGTGTAGCTATAACAAGGTGAACACCACCTGGGCCTGCGAAGACCAGTACACCCTGCAGACCGTTTTAAAAGACCAACTGGGCTTCCCTGGTTACGTGATGACAGATTGGAATGCCCAGCACACAACCGTTCAGAGCGCAAACAGTGGCCTGGATATGAGCATGCCGGGCACCGACTTCAACGGCAATAATCGTCTGTGGGGTCCGGCACTGACCAATGCCGTTAACAGCAACCAGGTGCCGACCAGTCGTGTGGACGATATGGTTACCCGTATTCTGGCCGCCTGGTACCTGACAGGTCAAGACCAGGCCGGCTACCCGAGCTTCAACATCAGCCGCAACGTGCAGGGTAATCACAAGACCAACGTTCGCGCAATCGCACGCGATGGTATCGTGCTGTTAAAGAACGATGCCAACATTCTGCCGCTGAAAAAACCGGCCAGCATCGCCGTTGTTGGTAGCGCAGCCATCATTGGCAACCACGCCCGTAACAGTCCGAGCTGCAATGATAAAGGCTGTGACGACGGTGCCCTGGGCATGGGTTGGGGTAGTGGTGCCGTGAACTACCCGTATTTCGTGGCCCCGTACGACGCCATTAACACCCGTGCAAGTAGCCAGGGTACCCAGGTTACCCTGAGCAACACCGACAACACAAGCAGCGGTGCCAGTGCAGCACGTGGTAAGGATGTGGCCATCGTGTTCATCACCGCCGACAGCGGCGAAGGCTACATTACCGTGGAGGGTAATGCCGGTGATCGCAATAATCTGGACCCGTGGCATAACGGCAACGCCCTGGTTCAGGCAGTGGCAGGCGCAAATAGCAACGTGATCGTTGTGGTGCATAGCGTGGGTGCCATCATTCTGGAGCAGATCCTGGCCCTGCCGCAAGTTAAGGCAGTTGTGTGGGCAGGTCTGCCGAGCCAAGAAAGTGGCAATGCCCTGGTGGACGTTCTGTGGGGCGATGTTAGTCCGAGCGGCAAGCTGGTGTATACAATCGCCAAGAGCCCGAACGACTATAACACCCGCATCGTTAGCGGCGGCAGTGATAGCTTCAGCGAGGGCCTGTTTATCGACTACAAGCATTTCGATGATGCCAATATTACCCCGCGCTACGAATTTGGTTATGGCCTGAGCTATACCAAGTTCAACTACAGCCGCCTGAGCGTTTTAAGTACCGCCAAGAGTGGTCCGGCAACAGGTGCCGTGGTTCCTGGTGGTCCGAGTGATCTGTTTCAGAATGTGGCCACCGTGACCGTGGATATCGCCAACAGTGGTCAGGTTACCGGCGCCGA
AGTGGCACAGCTGTACATCACCTATCCGAGCAGTGCACCGCGCACCCCGCCGAAACAGCTGCGTGGCTTCGCCAAATTAAACCTGACCCCGGGCCAGAGCGGTACAGCAACCTTCAATATTCGCCGCCGTGATCTGAGCTATTGGGACACCGCCAGCCAAAAATGGGTGGTGCCGAGCGGCAGCTTTGGCATTAGTGTGGGTGCAAGTAGCCGCGACATTCGCTTAACAAGCACCCTGAGTGTTGCC
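
Because codon optimization changes the nucleotide sequence without changing the encoded polypeptide, a simple consistency check is to translate both sequences and compare the results. The Python sketch below does this with the standard genetic code, using only the first 24 nucleotides of SEQ ID NO: 2 and SEQ ID NO: 3 as shown above. The function names are hypothetical, and the sketch assumes an in-frame coding sequence containing only A, C, G, and T; it is not a codon-optimization algorithm.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_polypeptide(dna_a, dna_b):
    """True when two coding sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


native_start = "ATGCGTTACCGAACAGCAGCTGCG"     # first 24 nt of SEQ ID NO: 2
optimized_start = "ATGCGTTATCGTACAGCCGCAGCC"  # first 24 nt of SEQ ID NO: 3
print(translate(native_start), encode_same_polypeptide(native_start, optimized_start))
```

The same check can be run on the full-length sequences once they are joined into single strings.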

In an embodiment, the nucleic acid sequence encoding a Cel3a enzyme or functional variant thereof comprises at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2. In an embodiment, the nucleic acid sequence encoding a Cel3a enzyme or functional variant thereof comprises at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2 or SEQ ID NO: 3.
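
One common way to express percent identity is the number of matching positions divided by the number of aligned columns. The Python sketch below computes this for two sequences that have already been aligned to equal length, with "-" marking gaps. The text above does not specify an alignment method or a denominator convention, so both choices here are assumptions, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumptions: both inputs have equal length, '-' denotes a gap, columns that
    are gaps in both sequences are ignored, and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy alignment (hypothetical): three of four columns match.
print(percent_identity("ATGC", "ATGA"))
```

Other conventions, such as dividing by the length of the shorter sequence, give different values for gapped alignments, so the convention should be stated alongside any reported percentage.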

Also provided herein is a nucleic acid sequence encoding an aglycosylated polypeptide having biomass-degrading activity described herein, e.g., Cel3a polypeptide, in which one or more glycosylation sites present in the polypeptide have been mutated such that a glycan can no longer be attached or linked to the glycosylation site. In another embodiment, the nucleic acid sequence described herein encodes an aglycosylated polypeptide, e.g., a Cel3a polypeptide, as described above, in which one or more amino acids proximal to one or more glycosylation sites present in the polypeptide have been mutated such that a glycan can no longer be attached or linked to the glycosylation site, as previously described.

The techniques used to isolate or clone a nucleic acid sequence encoding a polypeptide are known in the art and include isolation from genomic DNA, preparation from cDNA, or a combination thereof. The cloning of the nucleic acid sequences of the present invention from such genomic DNA can be effected, e.g., by using the well known polymerase chain reaction (PCR) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features. See, e.g., Innis et al., 1990, PCR: A Guide to Methods and Application, Academic Press, New York. Other amplification procedures such as ligase chain reaction (LCR), ligation activated transcription (LAT) and nucleic acid sequence-based amplification (NASBA) may be used. The nucleic acid sequence may be cloned from a strain of Trichoderma reesei, e.g., wild-type T. reesei, or T. reesei RUTC30, or another or related organism and thus, for example, may be an allelic or species variant of the polypeptide encoding region of the nucleic acid sequence.

The nucleic acid sequence may be obtained by standard cloning procedures used in genetic engineering to relocate the nucleic acid sequence from its natural location to a different site where it will be reproduced. The cloning procedures may involve excision and isolation of a desired fragment comprising the nucleotide sequence encoding the polypeptide, insertion of the fragment into a vector molecule, and incorporation of the recombinant vector into a host cell where multiple copies or clones of the nucleotide sequence will be replicated. The nucleotide sequence may be of genomic, cDNA, RNA, semisynthetic, synthetic origin, or any combinations thereof.

As used herein, an “expression vector” is a nucleic acid construct for introducing a nucleic acid sequence of interest into a host cell and expressing it there. In some embodiments, the vector comprises a suitable control sequence operably linked to and capable of effecting the expression of the polypeptide encoded by the nucleic acid sequence described herein. The control sequence may be an appropriate promoter sequence, recognized by a host cell for expression of the nucleic acid sequence. In an embodiment, the nucleic acid sequence of interest is a nucleic acid sequence encoding a polypeptide having biomass-degrading activity, e.g., cellobiase activity, as described herein.

A promoter in the expression vector described herein can include promoters obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell, mutant promoters, truncated promoters, and hybrid promoters.

Examples of suitable promoters for directing transcription of the nucleic acid constructs of the present invention in a bacterial host cell are the promoters obtained from the E. coli lac operon, E. coli tac promoter (hybrid promoter, DeBoer et al., PNAS, 1983, 80:21-25), E. coli recA, E. coli araBAD, E. coli tetA, and prokaryotic beta-lactamase. Other examples of suitable promoters include viral promoters, such as promoters from bacteriophages, including a T7 promoter, a T5 promoter, a T3 promoter, an M13 promoter, and an SP6 promoter. In some embodiments, more than one promoter controls the expression of the nucleic acid sequence of interest, e.g., an E. coli lac promoter and a T7 promoter. Further promoters that may be suitable for use in the present invention are described in “Useful proteins from recombinant bacteria” in Scientific American, 1980, 242:74-94, and Sambrook et al., Molecular Cloning: A Laboratory Manual, 1989. In some preferred embodiments, the promoter is inducible, where the addition of a molecule stimulates the transcription and expression of the downstream reading frame.

Examples of suitable promoters for directing transcription of the nucleic acid constructs of the present invention in a eukaryotic host cell, e.g., in a fungal or yeast cell, are promoters obtained from the genes of Trichoderma reesei, methanol-inducible alcohol oxidase (AOX promoter), Aspergillus nidulans tryptophan biosynthesis (trpC promoter), Aspergillus niger var. awamori glucoamylase (glaA), Saccharomyces cerevisiae galactokinase (GAL1), or Kluyveromyces lactis Plac4-PBI promoter.

A control sequence present in the expression vector described herein may also be a signal sequence that codes for an amino acid sequence linked to the amino terminus of a polypeptide and directs the encoded polypeptide into the cell's secretory pathway, e.g., a secretion signal sequence. The signal sequence may be an endogenous signal sequence, e.g., where the signal sequence is present at the N-terminus of the wild-type polypeptide when endogenously expressed by the organism from which the polypeptide of interest originates. The signal sequence may be a foreign, or heterologous, signal peptide, in which the signal sequence is from a different organism or a different polypeptide than that of the polypeptide of interest being expressed. Any signal sequence which directs the expressed polypeptide into the secretory pathway of a host cell may be used in the present invention. Typically, signal sequences are composed of between 6 and 136 basic and/or hydrophobic amino acids.

Examples of signal sequences suitable for the present invention include the signal sequence from Saccharomyces cerevisiae alpha-factor.

Fusion tags may also be used in the expression vector described herein to facilitate the detection and purification of the expressed polypeptide. Examples of suitable fusion tags include His-tag (e.g., 3× His, 6× His (SEQ ID NO: 22), or 8× His (SEQ ID NO: 21)), GST-tag, HSV-tag, S-tag, and T7 tag. Other suitable fusion tags include myc tag, hemagglutinin (HA) tag, and fluorescent protein tags (e.g., green fluorescent protein). The fusion tag is typically operably linked to the N or C terminus of the polypeptide to be expressed. In some embodiments, there may be a linker region between the fusion tag sequence and the N-terminus or C-terminus of the polypeptide to be expressed. In an embodiment, the linker region comprises a sequence of 1 to 20 amino acids that does not affect or alter the expression or function of the expressed polypeptide.

Utilization of the fusion tags described herein allows detection of the expressed protein, e.g., by western blot using antibodies that specifically recognize the tag. The tags also allow for purification of the expressed polypeptide from the host cell, e.g., by affinity chromatography. For example, an expressed polypeptide fused to a His-tag can be purified by using nickel affinity chromatography. The His-tag has affinity for nickel ions, and a nickel column will retain the His-tagged polypeptide, while allowing all other proteins and cell debris to flow through the column. Elution of the His-tagged polypeptide using an elution buffer, e.g., containing imidazole, releases the His-tagged polypeptide from the column, resulting in substantially purified polypeptide.

The expression vector described herein may further comprise a selectable marker gene to enable isolation of a genetically modified microbe transformed with the construct as is commonly known to those of skill in the art. The selectable marker gene may confer resistance to an antibiotic or the ability to grow on medium lacking a specific nutrient to the host organism that otherwise could not grow under these conditions. The present invention is not limited by the choice of selectable marker gene, and one of skill in the art may readily determine an appropriate gene. For example, the selectable marker gene may confer resistance to ampicillin, chloramphenicol, tetracycline, kanamycin, hygromycin, phleomycin, geneticin, or G418, or may complement a deficiency of the host microbe in one of the trp, arg, leu, pyr4, pyr, ura3, ura5, his, or ade genes or may confer the ability to grow on acetamide as a sole nitrogen source.

The expression vector described herein may further comprise other nucleic acid sequences, e.g., additional control sequences, as is commonly known to those of skill in the art, for example, transcriptional terminators, synthetic sequences to link the various other nucleic acid sequences together, origins of replication, ribosome binding sites, a multiple cloning site (or polylinker site), a polyadenylation signal and the like. The ribosome binding site (RBS) suitable for the expression vector depends on the host cell used; for example, for expression in a prokaryotic host cell, a prokaryotic RBS, e.g., a T7 phage RBS, can be used. A multiple cloning site, or polylinker site, contains one or more restriction enzyme sites that are preferably not present in the remaining sequence of the expression vector. The restriction enzyme sites are utilized for the insertion of a nucleic acid sequence encoding a polypeptide having cellobiase activity or other desired control sequences. The practice of the present invention is not limited by the presence of any one or more of these other nucleic acid sequences, e.g., other control sequences.

Examples of suitable expression vectors for use in the present invention include vectors for expression in prokaryotes, e.g., bacterial expression vectors. A bacterial expression vector suitable for use in the present invention is the pET vector (Novagen), which contains the following: a viral T7 promoter which is specific to only T7 RNA polymerase (not bacterial RNA polymerase) and also does not occur anywhere in the prokaryotic genome, a lac operator comprising a lac promoter and coding sequence for the lac repressor protein (lacI gene), a polylinker, an f1 origin of replication (so that a single-stranded plasmid can be produced when co-infected with M13 helper phage), an ampicillin resistance gene, and a ColE1 origin of replication (Blaber, 1998). Both the promoter and the lac operator are located 5′, or upstream, of the polylinker in which the nucleic acid sequence encoding a polypeptide described herein is inserted. The lac operator confers inducible expression of the nucleic acid sequence encoding a polypeptide having cellobiase activity. Addition of IPTG (isopropyl β-D-1-thiogalactopyranoside), a non-metabolizable analog of the lactose metabolite allolactose, triggers transcription of the lac operon and induces protein expression of the nucleic acid sequence under control of the lac operator. Use of this system requires the addition of T7 RNA polymerase to the host cell for vector expression. The T7 RNA polymerase can be introduced via a second expression vector, or a host cell strain that is genetically engineered to express T7 RNA polymerase can be used.

An exemplary expression vector for use with the invention is a pET vector, commercially available from Novagen. The pET expression system is described in U.S. Pat. Nos. 4,952,496; 5,693,489; and 5,869,320. In one embodiment, the pET vector is a pET-DUET vector, e.g., pET-Duet1, commercially available from Novagen. Other vectors suitable for use in the present invention include vectors containing His-tag sequences, such as those described in U.S. Pat. Nos. 5,310,663 and 5,284,933; and European Patent No. 282042.

The present invention also relates to a host cell comprising the nucleic acid sequence or expression vector of the invention, which is used in the recombinant production of the polypeptides having biomass-degrading activity.

An expression vector comprising a nucleic acid sequence of the present invention is introduced into a host cell so that the vector is maintained (e.g., by chromosomal integration or as a self-replicating extra-chromosomal vector) such that the polypeptide is expressed.

The host cell may be a prokaryote or a eukaryote. The host cell may be a bacterium, such as an E. coli strain, e.g., K12 strains NovaBlue, NovaBlue T1R, JM109, and DH5α. Preferably, the bacterial cell has the capability to fold, or partially fold, exogenously expressed proteins, such as E. coli Origami strains, e.g., Origami B, Origami B (DE3), Origami 2, and Origami 2 (DE3) strains. In some embodiments, it may be preferred to use a host cell that is deficient for glycosylation, or has an impaired glycosylation pathway such that proteins expressed by the host cell are not significantly glycosylated.

The host cell may be a yeast or a filamentous fungus, particularly those classified as Ascomycota. Genera of yeasts useful as host microbes for the expression of modified TrCel3A beta-glucosidases of the present invention include Saccharomyces, Pichia, Hansenula, Kluyveromyces, Yarrowia, and Arxula. Genera of fungi useful as microbes for the expression of the polypeptides of the present invention include Trichoderma, Hypocrea, Aspergillus, Fusarium, Humicola, Neurospora, Chrysosporium, Myceliophthora, Thielavia, Sporotrichum and Penicillium. For example, the host cell may be Pichia pastoris. For example, the host cell may be an industrial strain of Trichoderma reesei, or a mutant thereof, e.g., T. reesei RUTC30. Typically, the host cell is one which does not express a parental biomass-degrading enzyme, e.g., cellobiase or Cel3a.

The selection of the particular host cell, e.g., a bacterial cell or a fungal cell, depends on the expression vector (e.g., the control sequences) and/or the method utilized for producing an aglycosylated polypeptide of the invention, as described in further detail below.

The expression vector of the invention may be introduced into the host cell by any number of methods known by one skilled in the art of microbial transformation, including but not limited to, transformation, treatment of cells with CaCl₂, electroporation, biolistic bombardment, lipofection, and PEG-mediated fusion of protoplasts (e.g., White et al., WO 2005/093072, which is incorporated herein by reference). After selecting the recombinant host cells containing the expression vector (e.g., by selection utilizing the selectable marker of the expression vector), the recombinant host cells may be cultured under conditions that induce the expression of the polypeptide having biomass-degrading activity of the invention.

Methods for recovering the soluble polypeptides having biomass-degrading activity expressed from prokaryote and eukaryote cells are known in the art. In embodiments, the method for recovering the polypeptide comprises collecting the cells, e.g., by centrifugation or filtration, and lysing the cells, e.g., by mechanical, chemical, or enzymatic means. For example, cells can be physically broken apart, e.g., by sonication, milling (shaking with beads), or shear forces. Cell membranes can be permeabilized so that the contents of the cells are released, such as by treatment with detergents, e.g., Triton, NP-40, or SDS. Cells with cell walls, e.g., bacterial cells, can be permeabilized using enzymes, such as a lysozyme or lysonase. Any combination of the mechanical, chemical, and enzymatic techniques described above is also suitable for recovering expressed polypeptides of interest from the host cell in the context of this invention. For example, when expressing a polypeptide having biomass-degrading activity described herein in a bacterial cell, e.g., an E. coli cell, the cell is typically collected by centrifuging and pelleting the cell culture, and lysed by resuspending the cell pellet in a lysis buffer containing lysozyme. To ensure complete lysis, the resuspended cells are subjected to one of the following methods: sonication, milling, or homogenization. After centrifugation, the soluble polypeptides having biomass-degrading activity are present in the supernatant, while the insoluble polypeptides having biomass-degrading activity, e.g., in inclusion bodies, are found in the pellet. Methods for recovering the insoluble polypeptides having biomass-degrading activity, e.g., from inclusion bodies, are described further below in the section titled “Solubilization from Inclusion Bodies”.

The soluble polypeptides having biomass-degrading activity can then be purified or isolated from the cell lysate using standard methods known in the art. For polypeptides having biomass-degrading activity comprising a tag, e.g., a His tag, affinity chromatography can be used to separate the soluble polypeptides from the remainder of the soluble fraction of the lysate.

In one embodiment, the host cell expressing a polypeptide having biomass-degrading activity described herein is not lysed before addition to the biomass for the saccharification reaction. In some instances, the methods for lysing host cells and extracting the polypeptides having biomass-degrading activity can result in protein denaturation and/or decreased enzyme activity, which leads to increased cost of downstream processing. Thus, the present invention also provides methods for directly adding the host cells expressing an aglycosylated polypeptide having biomass-degrading activity described herein to the biomass prior to the saccharification step.

In an embodiment, the host cell, e.g., the E. coli cell, expressing a polypeptide having biomass-degrading activity described herein is isolated, e.g., by centrifugation, and added to the saccharification reaction, e.g., the saccharification reactor containing biomass. The cells are lysed by a combination of shear from the biomass, the impellers, and the increased temperature. In an embodiment, the culture of host cells, e.g., E. coli cells, expressing the polypeptide having biomass-degrading activity described herein is added directly from the fermentation tank to the saccharification tank, eliminating the need to pellet cells by centrifugation. In an embodiment, the polypeptide is glycosylated or aglycosylated.

Solubilization from Inclusion Bodies

In embodiments, a cell, e.g., a microorganism disclosed herein, has been genetically modified using methods described herein to produce at least one polypeptide having a biomass-degrading activity. At least a portion, e.g., at least 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90%, of the polypeptide having biomass-degrading activity is found in inclusion bodies in the genetically modified cell. Disclosed herein are methods for solubilizing the at least one polypeptide having biomass-degrading activity from the inclusion bodies.

Inclusion bodies are insoluble aggregates in host cells comprising heterologously expressed proteins, e.g., a polypeptide having a biomass-degrading activity, when expressed at high levels. Inclusion bodies can be found in the nucleus or the cytoplasm. Inclusion bodies can also contain other components, such as other proteins endogenous to the host cell, e.g., host proteins, ribosomal components, nucleic acids (e.g., RNA and/or DNA), and cellular debris (e.g., cell wall debris, lipids, metabolites). Proteins endogenous to the host cell include any protein that is encoded by genomic DNA of the host cell. The protein endogenous to the host cell may be localized to the cytoplasm, or may interact with the heterologously expressed protein. Examples of ribosomal components that can be found in an inclusion body include ribosomes, fragments of ribosomal subunits (e.g., 50S subunit or 30S subunit), partially translated polypeptides, transfer RNA, elongation factors, and/or messenger RNA. Examples of nucleic acids that can be found in an inclusion body include genomic DNA of the host, exogenous DNA (e.g., from a plasmid or expression vector introduced into the host cell), messenger RNA, transfer RNA, ribosomal RNA, or any fragments thereof. Examples of cellular debris that can be found in an inclusion body include cell wall or membrane debris (e.g., fragments or components of the cell wall or membrane), nuclear membrane debris (e.g., fragments or components of the nuclear membrane), fragments or components of other host organelles, endotoxins, lipids, and/or metabolites.

Additional methods for reducing aggregation of inclusion bodies include sonication, incubation at varying temperatures, acid/base treatment, protease treatment, electrical treatment, mechanical treatment, and addition of organisms that produce proteases. For example, the cells or lysates thereof containing inclusion bodies are incubated at temperatures ranging from −20° C. to 0° C., 0° C. to 4° C., 4° C. to 20° C., 20° C. to 40° C., and 40° C. to 80° C.

To isolate the inclusion bodies, the host cell expressing a polypeptide having biomass-degrading activity is first lysed, using standard methods in the art, such as lysis by lysozyme or other denaturing agents, ultrasound treatment, sonication, or high pressure homogenization. The host cells are lysed under conditions that do not lead to solubilization of an inclusion body. Inclusion bodies are isolated from the host cell using techniques known in the art. For example, the cell lysate is separated such that the inclusion bodies containing polypeptides having biomass-degrading activity and other insoluble matter are present in an insoluble fraction, while the soluble fraction contains the soluble polypeptides having biomass-degrading activity. Such separation can be accomplished through centrifugation, whereby the inclusion bodies are found in the pellet, e.g., the insoluble fraction, and the soluble polypeptides are found in the supernatant, e.g., the soluble fraction. Other methods suitable for separation of an insoluble fraction from the soluble fraction include filtration.

Solubilization of the inclusion bodies to release a polypeptide having biomass-degrading activity comprises adding a solubilizing agent to the insoluble fraction or inclusion bodies. In some embodiments, a solubilizing agent can be an agent that prevents protein aggregation or precipitation, or dissolves protein aggregates. In some embodiments, the solubilizing agent includes an agent that disrupts van der Waals interactions, hydrophobic interactions, hydrogen bonding, dipole-dipole interactions, ionic interactions, pi stacking, or any combination thereof.

In some embodiments, the solubilizing agent can be an agent that disrupts hydrophobic interactions, e.g., a detergent. Exemplary detergents include nonionic, zwitterionic, anionic and cationic detergents. In some embodiments, the solubilizing agent can be nonionic, e.g., NP-40 and Triton X-100. In some embodiments, the solubilizing agent can be zwitterionic, e.g., CHAPS and sulfobetaines, e.g., SB3-10 or ASB 14. In some embodiments, the solubilizing agent can be anionic, e.g., sodium dodecyl sulfate (SDS).

In some embodiments, the solubilizing agent can be an agent that reduces disulfide bonds, e.g., a thiol reducing agent. Exemplary thiol reducing agents include 2-mercaptoethanol (βME) and dithiothreitol (DTT). In some embodiments, the solubilizing agent that reduces disulfide bonds can be a phosphine, e.g., tributylphosphine (TBP) or tris(2-carboxyethyl)phosphine (TCEP).

In some embodiments, the solubilizing agent can be an agent that disrupts hydrogen bonding and hydrophobic interactions. In some embodiments, the solubilizing agent can be a chaotropic compound, e.g., urea and substituted ureas (e.g., thiourea), and guanidinium hydrochloride.

In some embodiments, the solubilizing agent can be an agent that is a nonpolar solvent. Nonpolar solvents contain bonds between atoms with similar electronegativities, such as carbon and hydrogen, and have very low dielectric constants. For example, nonpolar solvents have a dielectric constant of less than 5. Examples of nonpolar solvents include pentane, hexane, cyclohexane, benzene, toluene, chloroform, and diethyl ether.

In some embodiments, the solubilizing agent can be an agent that is a polar solvent. Polar solvents are characterized by having large dipole moments (or “partial charges”); they contain bonds between atoms with very different electronegativities, such as oxygen and hydrogen. In one embodiment, the polar solvents suitable for use in the invention herein have a dielectric constant of at least 5, or at least 20. In a preferred embodiment, the polar solvents have a high dielectric constant, e.g., a dielectric constant greater than 25. In some embodiments, the solubilizing agent is a protic polar solvent, which has O-H or N-H bonds, has a high dielectric constant, e.g., greater than 20 or greater than 25, and is a good hydrogen bond donor, e.g., formic acid, n-butanol, isopropanol, n-propanol, ethanol, methanol, or nitromethane. In some embodiments, the solubilizing agent can be an aprotic polar solvent, which lacks O-H or N-H bonds and has a dielectric constant between 5 and 20, e.g., dimethylsulfoxide (DMSO), dichloromethane (DCM), tetrahydrofuran (THF), ethyl acetate, acetone, dimethylformamide (DMF), or acetonitrile (MeCN).
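By way of illustration only, the dielectric-constant thresholds recited for nonpolar solvents (less than 5) and for polar solvents (at least 5, with values greater than 25 regarded as high) can be expressed as a simple classification. The following Python sketch is not part of the described methods; the function name and the table of approximate room-temperature dielectric constants are assumptions introduced for this example, and the values are rounded literature figures rather than data from this disclosure.

```python
# Approximate room-temperature dielectric constants. These are rounded,
# illustrative literature values and are NOT taken from this disclosure.
APPROX_DIELECTRIC = {
    "hexane": 1.9,
    "toluene": 2.4,
    "diethyl ether": 4.3,
    "acetone": 21.0,
    "methanol": 33.0,
    "water": 80.0,
}


def classify_solvent(dielectric_constant: float) -> str:
    """Classify a solvent using the thresholds recited in the text.

    < 5   -> nonpolar
    >= 5  -> polar
    > 25  -> polar with a high dielectric constant
    """
    if dielectric_constant <= 0:
        raise ValueError("dielectric constant must be positive")
    if dielectric_constant < 5:
        return "nonpolar"
    if dielectric_constant > 25:
        return "polar (high dielectric constant)"
    return "polar"


if __name__ == "__main__":
    for name, eps in APPROX_DIELECTRIC.items():
        print(f"{name}: {classify_solvent(eps)}")
```

The classification depends only on the numeric thresholds; whether a given polar solvent is protic or aprotic is a structural property (presence of O-H or N-H bonds) that this sketch does not attempt to infer.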

In some embodiments, the solubilizing agent can be an agent that has a positive charge, which may be suitable for disrupting ionic interactions of a net negatively charged molecule. Exemplary positively charged solubilizing agents include N-methyl-D-glucamine, choline, arginine, lysine, procaine, tromethamine (TRIS), spermine, N-methyl-morpholine, glucosamine, N,N-bis(2-hydroxyethyl)glycine, diazabicycloundecene, creatine, arginine ethyl ester, amantadine, rimantadine, ornithine, taurine, and citrulline. Cationic moieties may additionally include sodium, potassium, calcium, magnesium, ammonium, monoethanolamine, diethanolamine, triethanolamine, tromethamine, lysine, histidine, arginine, morpholine, methylglucamine, and glucosamine.

In some embodiments, the solubilizing agent can be an agent that has a negative charge, which may be suitable for disrupting ionic interactions of a net positively charged molecule. Exemplary negatively charged solubilizing agents include acetate, propionate, butyrate, pentanoate, hexanoate, heptanoate, levulinate, chloride, bromide, iodide, citrate, succinate, maleate, glycolate, gluconate, glucuronate, 3-hydroxyisobutyrate, 2-hydroxyisobutyrate, lactate, malate, pyruvate, fumarate, tartrate, tartronate, nitrate, phosphate, benzene sulfonate, methane sulfonate, sulfate, sulfonate, acetic acid, adamantoic acid, alpha keto glutaric acid, D- or L-aspartic acid, benzenesulfonic acid, benzoic acid, 10-camphorsulfonic acid, citric acid, 1,2-ethanedisulfonic acid, fumaric acid, D-gluconic acid, D-glucuronic acid, glucaric acid, D- or L-glutamic acid, glutaric acid, glycolic acid, hippuric acid, hydrobromic acid, hydrochloric acid, 1-hydroxy-2-naphthoic acid, lactobionic acid, maleic acid, L-malic acid, mandelic acid, methanesulfonic acid, mucic acid, 1,5-naphthalenedisulfonic acid tetrahydrate, 2-naphthalenesulfonic acid, nitric acid, oleic acid, pamoic acid, phosphoric acid, p-toluenesulfonic acid hydrate, D-saccharic acid monopotassium salt, salicylic acid, stearic acid, succinic acid, sulfuric acid, tannic acid, and D- or L-tartaric acid.

The solubilizing agent is added to an inclusion body, or a fraction containing inclusion bodies, at a sufficient concentration to solubilize a polypeptide having biomass-degrading activity from the inclusion body, for example, at a concentration of about 0.01-10M, about 0.05-10M, about 0.1-10M, about 0.2-10M, about 0.5-10M, about 1-10M, about 2-10M, about 5-10M, about 8-10M, about 0.01-6M, about 0.05-6M, about 0.1-6M, about 0.2-6M, about 0.5-6M, about 1-6M, about 2-6M, about 4-6M, or about 5-6M. In an embodiment, the solubilizing agent is added to an inclusion body, or a fraction containing inclusion bodies, at a concentration of about 0.01M, about 0.02M, about 0.05M, about 0.1M, about 0.2M, about 0.5M, about 1M, about 2M, about 3M, about 4M, about 5M, about 6M, about 7M, about 8M, about 9M, or about 10M.

In a preferred embodiment, the solubilizing agent is urea, and is added to an inclusion body, or a fraction containing inclusion bodies, at a concentration of about 0.01-10M, about 0.05-10M, about 0.1-10M, about 0.2-10M, about 0.5-10M, about 1-10M, about 2-10M, about 5-10M, about 8-10M, about 0.01-6M, about 0.05-6M, about 0.1-6M, about 0.2-6M, about 0.5-6M, about 1-6M, about 2-6M, about 4-6M, or about 5-6M. In an embodiment, urea is added to an inclusion body, or a fraction containing inclusion bodies, at a concentration of about 0.01M, about 0.02M, about 0.05M, about 0.1M, about 0.2M, about 0.5M, about 1M, about 2M, about 3M, about 4M, about 5M, about 6M, about 7M, about 8M, about 9M, or about 10M. In a preferred embodiment, the urea is added to an inclusion body, or a fraction containing inclusion bodies, at a concentration of 6M.
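As a worked illustration of the urea concentrations recited above, the following Python sketch computes the mass of solid urea needed to reach a target molarity in a given final volume, and the volume of a concentrated stock needed to reach the same target by dilution. It is a minimal sketch under stated assumptions: the function names are hypothetical, the 8 M stock in the usage note is only an example, and the molar mass of urea (60.06 g/mol) is a standard value not stated in this disclosure.

```python
UREA_MOLAR_MASS_G_PER_MOL = 60.06  # standard value for CH4N2O (assumption, not from the text)


def urea_mass_g(target_molarity: float, final_volume_l: float) -> float:
    """Mass of solid urea (g) giving `target_molarity` (mol/L) in `final_volume_l` (L).

    The volume is the FINAL solution volume, not the volume of buffer added,
    because dissolved urea contributes noticeably to the volume.
    """
    if target_molarity < 0 or final_volume_l <= 0:
        raise ValueError("molarity must be >= 0 and volume must be > 0")
    return target_molarity * final_volume_l * UREA_MOLAR_MASS_G_PER_MOL


def stock_volume_l(stock_molarity: float, target_molarity: float,
                   final_volume_l: float) -> float:
    """Volume of stock (L) to dilute to `final_volume_l`, from C1*V1 = C2*V2."""
    if stock_molarity <= 0 or final_volume_l <= 0:
        raise ValueError("stock molarity and final volume must be > 0")
    if not 0 <= target_molarity <= stock_molarity:
        raise ValueError("target molarity must lie between 0 and the stock molarity")
    return target_molarity * final_volume_l / stock_molarity


if __name__ == "__main__":
    print(urea_mass_g(6.0, 1.0))          # grams of urea for 1 L of 6 M urea
    print(stock_volume_l(8.0, 6.0, 1.0))  # litres of a hypothetical 8 M stock
```

For example, `stock_volume_l(8.0, 6.0, 1.0)` evaluates to 0.75, i.e., 0.75 L of the hypothetical 8 M stock brought to a final volume of 1 L gives 6 M urea.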

After solubilization using a solubilizing agent, the resulting mixture contains a solubilized polypeptide having biomass-degrading activity, as described herein. The resulting mixture can be used directly in an enzymatic process, such as a reaction for producing products, e.g., a saccharification reaction, as described in further detail in the section titled “Methods of Producing Products Using Solubilized Enzymes”. In this embodiment, the mixture may contain other components of the inclusion body, such as other proteins endogenous to the host cell, ribosomal components, nucleic acids (e.g., RNA and/or DNA), and cellular debris (e.g., cell wall debris, lipids, metabolites).

In other embodiments, the resulting mixture containing a solubilized polypeptide having biomass-degrading activity is further processed to purify or isolate the solubilized polypeptide having biomass-degrading activity from the other solubilized components of the inclusion bodies. Suitable methods for isolating or purifying the solubilized polypeptide having biomass-degrading activity include affinity purification techniques. For example, the polypeptide having biomass-degrading activity preferably contains a tag or fusion peptide that can be utilized for affinity purification. In an embodiment, the polypeptide having biomass-degrading activity contains a His-tag, e.g., an 8× His tag (SEQ ID NO: 21), and the solubilized polypeptide having biomass-degrading activity can be purified using nickel affinity chromatography, e.g., an immobilized metal ion affinity chromatography (IMAC) system. In one embodiment, all purification steps occur in the presence of the solubilizing agent used to solubilize the polypeptide from an inclusion body, including in the washing and elution steps of the purification process. Accordingly, in one embodiment, the resulting purified solubilized polypeptide having biomass-degrading activity also contains a solubilizing agent.

The present invention provides a mixture comprising a polypeptide having biomass-degrading activity and a solubilizing agent. The mixture is obtained through the solubilization of inclusion bodies, as described above. The resulting mixture may also further comprise one or more proteins associated with the inclusion bodies. The solubilized polypeptide having biomass-degrading activity may also be purified by affinity purification techniques. In this case, the resulting mixture does not comprise one or more proteins associated with the inclusion bodies. The mixture can comprise other components found in inclusion bodies, such as other proteins endogenous to the host cell, ribosomal components, nucleic acids (e.g., RNA and/or DNA), and cellular debris (e.g., cell wall debris, lipids, metabolites).

In an embodiment, the solubilized polypeptide having biomass-degrading activity can be partially unfolded, partially misfolded, or partially denatured. In an embodiment, the solubilized polypeptide having biomass-degrading activity has at least 1%, at least 2%, at least 3%, at least 4%, at least 5%, at least 6%, at least 7%, at least 8%, at least 9%, at least 10%, at least 8-10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 100% biomass-degrading activity compared to the native polypeptide having biomass-degrading activity. In an embodiment, the solubilized polypeptide having biomass-degrading activity has about 1-10%, 1-20%, 1-30%, 1-40%, 1-50%, 1-60%, 1-70%, 1-80%, 1-90%, 1-100%, 10-20%, 10-30%, 10-40%, 10-50%, 10-60%, 10-70%, 10-80%, 10-90%, 10-100%, 20-30%, 20-40%, 20-50%, 20-60%, 20-70%, 20-80%, 20-90%, 20-100%, 30-40%, 30-50%, 30-60%, 30-70%, 30-80%, 30-90%, 30-100%, 40-50%, 40-60%, 40-70%, 40-80%, 40-90%, 40-100%, 50-60%, 50-70%, 50-80%, 50-90%, 50-100%, 60-70%, 60-80%, 60-90%, 60-100%, 70-80%, 70-90%, 70-100%, 80-90%, 80-100%, or 90-100% of the biomass-degrading activity compared to the native polypeptide having biomass-degrading activity. In a preferred embodiment, the solubilized polypeptide having biomass-degrading activity has at least 8-10% of the activity of the native polypeptide. The native polypeptide having biomass-degrading activity refers to, e.g., the corresponding polypeptide having biomass-degrading activity isolated from the soluble fraction, the corresponding polypeptide having biomass-degrading activity that is properly folded in its native form (thereby having 100% biomass-degrading activity), or the corresponding polypeptide having biomass-degrading activity endogenously expressed from the microorganism from which the polypeptide originates. Biomass-degrading activity can be determined by any of the assays described herein, e.g., a ligninase activity assay, an endoglucanase activity assay, a cellobiohydrolase activity assay, a cellobiase activity assay, or a xylanase activity assay.

In one aspect, the mixture comprises a polypeptide having cellobiase activity, e.g., a Cel3a or a functional variant thereof from T. reesei, e.g., a polypeptide with at least 90% identity to SEQ ID NO: 1, and a solubilizing agent, e.g., urea. In one embodiment, the mixture comprises a polypeptide having at least 90% identity to SEQ ID NO: 1 and a solubilizing agent, e.g., urea, wherein the polypeptide has at least 20% of the cellobiase activity compared to the native polypeptide, e.g., SEQ ID NO: 1 or Cel3a from T. reesei. The mixture is obtained through the solubilization of inclusion bodies, as described above. The resulting mixture may also further comprise one or more proteins associated with the inclusion bodies. The solubilized polypeptide having cellobiase activity, e.g., a polypeptide with at least 90% identity to SEQ ID NO: 1, may also be purified by affinity purification techniques. In this case, the resulting mixture does not comprise one or more proteins associated with the inclusion bodies. The mixture can comprise other components found in inclusion bodies, such as other proteins endogenous to the host cell, ribosomal components, nucleic acids (e.g., RNA and/or DNA), and cellular debris (e.g., cell wall debris, lipids, metabolites). In an embodiment, the solubilizing agent is urea, and the urea is present at 0.2-6M.

In an embodiment, the solubilized polypeptide having cellobiase activity, or at least 90% identity with SEQ ID NO: 1, can be partially unfolded, partially misfolded, or partially denatured. In an embodiment, the solubilized polypeptide having cellobiase activity, or at least 90% identity with SEQ ID NO: 1, has at least 1%, at least 2%, at least 3%, at least 4%, at least 5%, at least 6%, at least 7%, at least 8%, at least 9%, at least 10%, at least 8-10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 100% cellobiase activity compared to the native polypeptide. In an embodiment, the solubilized polypeptide having cellobiase activity, or at least 90% identity with SEQ ID NO: 1, has about 1-10%, 1-20%, 1-30%, 1-40%, 1-50%, 1-60%, 1-70%, 1-80%, 1-90%, 1-100%, 10-20%, 10-30%, 10-40%, 10-50%, 10-60%, 10-70%, 10-80%, 10-90%, 10-100%, 20-30%, 20-40%, 20-50%, 20-60%, 20-70%, 20-80%, 20-90%, 20-100%, 30-40%, 30-50%, 30-60%, 30-70%, 30-80%, 30-90%, 30-100%, 40-50%, 40-60%, 40-70%, 40-80%, 40-90%, 40-100%, 50-60%, 50-70%, 50-80%, 50-90%, 50-100%, 60-70%, 60-80%, 60-90%, 60-100%, 70-80%, 70-90%, 70-100%, 80-90%, 80-100%, or 90-100% of the cellobiase activity compared to the native polypeptide. The native polypeptide is, for example, SEQ ID NO: 1 that is properly folded (e.g., 100% folded), Cel3a that is isolated from T. reesei, or a functional variant thereof. Cellobiase activity can be measured using the assays described herein, and can be quantified as the concentration of glucose (g/L) released after 30 minutes or the % of cellobiose converted to glucose in 30 minutes.
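The two quantities named above, the percentage of cellobiose converted to glucose and the activity of a solubilized polypeptide relative to the native polypeptide, can be illustrated with a short calculation. The Python sketch below is an assumption-laden example and not the assay described herein: the function names are hypothetical, the molar masses are standard values not given in this disclosure, and the calculation assumes that each mole of cellobiose hydrolyzed yields two moles of glucose and that all measured glucose derives from the cellobiose substrate.

```python
GLUCOSE_MOLAR_MASS = 180.156     # g/mol, C6H12O6 (standard value, assumption)
CELLOBIOSE_MOLAR_MASS = 342.297  # g/mol, C12H22O11 (standard value, assumption)


def percent_cellobiose_converted(glucose_g_per_l: float,
                                 initial_cellobiose_g_per_l: float) -> float:
    """Percent of the initial cellobiose converted to glucose.

    One cellobiose gives two glucose on hydrolysis, so the theoretical
    maximum glucose mass is slightly larger than the cellobiose mass
    (one water molecule is added per bond cleaved).
    """
    if initial_cellobiose_g_per_l <= 0 or glucose_g_per_l < 0:
        raise ValueError("concentrations must be non-negative and substrate > 0")
    max_glucose = initial_cellobiose_g_per_l * (
        2 * GLUCOSE_MOLAR_MASS / CELLOBIOSE_MOLAR_MASS)
    return 100.0 * glucose_g_per_l / max_glucose


def relative_activity_percent(solubilized_activity: float,
                              native_activity: float) -> float:
    """Activity of the solubilized polypeptide as a percent of the native one.

    Both values must come from the same assay, in the same units,
    measured over the same time (e.g., g/L glucose after 30 minutes).
    """
    if native_activity <= 0 or solubilized_activity < 0:
        raise ValueError("native activity must be > 0 and sample activity >= 0")
    return 100.0 * solubilized_activity / native_activity


if __name__ == "__main__":
    print(percent_cellobiose_converted(2.0, 5.0))
    print(relative_activity_percent(2.0, 10.0))
```

For instance, `relative_activity_percent(2.0, 10.0)` evaluates to 20.0, which would correspond to the "at least 20%" threshold recited above for hypothetical readings of 2.0 and 10.0 g/L glucose.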

Methods for Producing Aglycosylated Polypeptides

The present invention further provides methods for producing an aglycosylated polypeptide having biomass-degrading activity in a host cell, wherein the host cell, or lysate thereof, is treated with a solubilizing agent at a concentration suitable for solubilizing the aglycosylated polypeptide, as described herein. The method comprises culturing the host cell expressing the polypeptide having biomass-degrading activity under conditions suitable for the expression of the polypeptide. The method may also comprise recovering the aglycosylated polypeptide having biomass-degrading activity from the host cell. In the methods described in further detail below, the polypeptide having biomass-degrading activity has, e.g., ligninase activity, endoglucanase activity, cellobiohydrolase activity, cellobiase activity, or xylanase activity. In an embodiment, the polypeptide having cellobiase activity comprises a Cel3a from T. reesei, or a functional fragment thereof. In another embodiment, the polypeptide having cellobiase activity comprises SEQ ID NO: 1.

Using a Host Cell Deficient for Glycosylation

In embodiments, an expression vector comprising a nucleic acid sequence encoding a polypeptide having biomass-degrading activity described herein operably linked to a fusion tag is introduced into and expressed in a cell that does not significantly glycosylate proteins expressed in the cell, e.g., a bacterial host cell. The recombinant host cell is cultured under conditions for expression of the polypeptide, resulting in the production of an aglycosylated polypeptide having biomass-degrading activity. The aglycosylated polypeptide can be purified or isolated from the host cell using affinity chromatography methods for the fusion tag as described herein.

For example, in this embodiment, the expression vector contains a lac operator and a T7 promoter upstream of the nucleic acid sequence encoding a polypeptide having biomass-degrading activity, and the host cell has the capacity to express T7 RNA polymerase. Expression of the polypeptide having biomass-degrading activity is induced by addition of IPTG. Preferably, the host cell is an E. coli cell, preferably an E. coli Origami cell. In this embodiment, the fusion tag is a His-tag, and the purification of the expressed aglycosylated polypeptide comprises nickel affinity chromatography.

Using a Host Cell with the Capacity for Glycosylation

In another embodiment, an expression vector comprising a nucleic acid sequence encoding a polypeptide comprising one or more glycosylation site mutations such that the polypeptide is not glycosylated, as described herein, is expressed in a host cell, wherein the host cell is capable of glycosylating proteins expressed within the cell, e.g., a yeast or fungal host cell. Alternatively, the host cell is not capable of glycosylating proteins expressed within the cell, e.g., a bacterial host cell. In this embodiment, the polypeptide is operably linked to a fusion tag. The aglycosylated polypeptide can be purified or isolated from the host cell using affinity chromatography methods for the fusion tag as described herein.

In yet another embodiment, an expression vector comprising a nucleic acid sequence encoding a polypeptide having biomass-degrading activity described herein is expressed in a host cell, wherein the host cell is capable of glycosylating proteins expressed within the cell. The cells are cultured under conditions sufficient for expression and glycosylation of the polypeptide. In this embodiment, the polypeptide is operably linked to a fusion tag. The glycosylated polypeptide can be purified or isolated from the host cell using affinity chromatography methods for the fusion tag as described herein. After purification from the host cells and other endogenous host enzymes, e.g., glycosylation enzymes, the glycans of the isolated glycosylated polypeptide can be removed by incubation with deglycosylating enzymes. Deglycosylating enzymes include PNGase F, PNGase A, EndoH (endoglycosidase H), EndoS (endoglycosidase S), EndoD (endoglycosidase D), EndoF (endoglycosidase F), EndoF1 (endoglycosidase F1), or EndoF2 (endoglycosidase F2). Protein deglycosylation mixes containing enzymes sufficient for the complete removal of glycans are commercially available, e.g., from New England Biolabs. The isolated polypeptide is incubated with one or more deglycosylating enzymes under conditions sufficient for the removal of all of the glycans from the polypeptide. Other methods are known in the art for removing glycans from a polypeptide, e.g., β-elimination with mild alkali or mild hydrazinolysis. The glycosylation state of the polypeptide can be assessed using methods for staining and visualization of glycans known in the art, or mass spectrometry.

In yet another embodiment, an expression vector comprising a nucleic acid sequence encoding a polypeptide having biomass-degrading activity described herein is expressed in a host cell, wherein the host cell is capable of glycosylating proteins expressed within the cell. The cells are cultured under conditions sufficient for expression of the polypeptide, but in the presence of glycosylation inhibitors. The glycosylation inhibitors are present at a concentration and for a sufficient time such that the expressed polypeptides are aglycosylated. In this embodiment, the polypeptide is operably linked to a fusion tag. The resulting aglycosylated polypeptide can be purified or isolated from the host cell using affinity chromatography methods for the fusion tag as described herein.

Examples of suitable glycosylation inhibitors for use in this embodiment include tunicamycin, Benzyl-GalNAc (Benzyl 2-acetamido-2-deoxy-α-D-galactopyranoside), 2-Fluoro-2-deoxy-D-glucose, and 5′CDP (5′ cytidylate diphosphate). In some embodiments, a combination of glycosylation inhibitors is used. Preferably, the concentration of glycosylation inhibitors used in this embodiment is sufficient to inhibit glycosylation of the polypeptide, but does not cause cytotoxicity or inhibition of protein expression of the host cell.

Methods of Converting Biomass into Products

The present invention provides methods and compositions for converting or processing a biomass into products, using an aglycosylated polypeptide having cellobiase activity, as described herein. Methods for converting a biomass to products, such as sugar products, are known in the art, for example, as described in US Patent Application 2014/0011258, the contents of which are incorporated by reference in their entirety. Briefly, a biomass is optimally pretreated, e.g., to reduce the recalcitrance, and saccharified by a saccharification process that involves incubating the treated biomass with biomass-degrading, or cellulolytic, enzymes to produce sugars (e.g., glucose and/or xylose). The sugar products can then be further processed to produce a final product, e.g., by fermentation or distillation. Final products include alcohols (e.g., ethanol, isobutanol, or n-butanol), sugar alcohols (e.g., erythritol, xylitol, or sorbitol), or organic acids (e.g., lactic acid, pyruvic acid, succinic acid).

Using the processes described herein, the biomass material can be converted to one or more products, such as energy, fuels, foods and materials. Specific examples of products include, but are not limited to, hydrogen, sugars (e.g., glucose, xylose, arabinose, mannose, galactose, fructose, cellobiose, disaccharides, oligosaccharides and polysaccharides), alcohols (e.g., monohydric alcohols or dihydric alcohols, such as ethanol, n-propanol, isobutanol, sec-butanol, tert-butanol or n-butanol), hydrated or hydrous alcohols (e.g., containing greater than 10%, 20%, 30% or even greater than 40% water), biodiesel, organic acids, hydrocarbons (e.g., methane, ethane, propane, isobutene, pentane, n-hexane, biodiesel, bio-gasoline and mixtures thereof), co-products (e.g., proteins, such as cellulolytic proteins (enzymes) or single cell proteins), and mixtures of any of these in any combination or relative concentration, and optionally in combination with any additives (e.g., fuel additives). Other examples include carboxylic acids, salts of a carboxylic acid, a mixture of carboxylic acids and salts of carboxylic acids and esters of carboxylic acids (e.g., methyl, ethyl and n-propyl esters), ketones (e.g., acetone), aldehydes (e.g., acetaldehyde), alpha and beta unsaturated acids (e.g., acrylic acid) and olefins (e.g., ethylene). Other alcohols and alcohol derivatives include propanol, propylene glycol, 1,4-butanediol, 1,3-propanediol, sugar alcohols and polyols (e.g., glycol, glycerol, erythritol, threitol, arabitol, xylitol, ribitol, mannitol, sorbitol, galactitol, iditol, inositol, volemitol, isomalt, maltitol, lactitol, maltotriitol, maltotetraitol, and polyglycitol and other polyols), and methyl or ethyl esters of any of these alcohols. Other products include methyl acrylate, methylmethacrylate, lactic acid, citric acid, formic acid, acetic acid, propionic acid, butyric acid, succinic acid, valeric acid, caproic acid, 3-hydroxypropionic acid, palmitic acid, stearic acid, oxalic acid, malonic acid, glutaric acid, oleic acid, linoleic acid, glycolic acid, gamma-hydroxybutyric acid, and mixtures thereof, salts of any of these acids, mixtures of any of the acids and their respective salts.

Biomass

The biomass to be processed using the methods described herein is a starchy material and/or a cellulosic material comprising cellulose, e.g., a lignocellulosic material. The biomass may also comprise hemicellulose and/or lignin. The biomass can comprise one or more of an agricultural product or waste, a paper product or waste, a forestry product, or a general waste, or any combination thereof. An agricultural product or waste comprises material that can be cultivated, harvested, or processed for use or consumption, e.g., by humans or animals, or any intermediate, byproduct, or waste that is generated from the cultivation, harvest, or processing methods. Agricultural products or waste include, but are not limited to, sugar cane, jute, hemp, flax, bamboo, sisal, alfalfa, hay, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, canola straw, wheat straw, barley straw, oat straw, rice straw, corn cobs, corn stover, corn fiber, coconut hair, beet pulp, bagasse, soybean stover, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, or beeswing, or a combination thereof. A paper product or waste comprises material that is used to make a paper product, any paper product, or any intermediate, byproduct or waste that is generated from making or breaking down the paper product. Paper products or waste include, but are not limited to, paper, pigmented papers, loaded papers, coated papers, corrugated paper, filled papers, magazines, printed matter, printer paper, polycoated paper, cardstock, cardboard, paperboard, or paper pulp, or a combination thereof. A forestry product or waste comprises material that is produced by cultivating, harvesting, or processing of wood, or any intermediate, byproduct, or waste that is generated from the cultivation, harvest, or processing of the wood. Forestry products or waste include, but are not limited to, aspen wood, wood from any genus or species of tree, particle board, wood chips, or sawdust, or a combination thereof. A general waste includes, but is not limited to, manure, sewage, or offal, or a combination thereof.

The biomass may include, but is not limited to, starchy materials, sugar cane, agricultural waste, paper, paper products, paper waste, paper pulp, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, card stock, cardboard, paperboard, cotton, wood, particle board, forestry wastes, sawdust, aspen wood, wood chips, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, agricultural waste, silage, canola straw, wheat straw, barley straw, oat straw, rice straw, jute, hemp, flax, bamboo, sisal, abaca, corn cobs, corn stover, soybean stover, corn fiber, alfalfa, hay, coconut hair, sugar processing residues, bagasse, beet pulp, agave bagasse, algae, seaweed, plankton, manure, sewage, offal, agricultural or industrial waste, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, or mixtures of any of these. In a preferred embodiment, the biomass comprises agricultural waste, such as corn cobs, e.g., corn stover. In another embodiment, the biomass comprises grasses.

In one embodiment, the biomass is treated prior to contact with the compositions described herein. For example, the biomass is treated to reduce the recalcitrance of the biomass, to reduce its bulk density, and/or increase its surface area. Suitable biomass treatment processes may include, but are not limited to: bombardment with electrons, sonication, oxidation, pyrolysis, steam explosion, chemical treatment, mechanical treatment, and freeze grinding. Preferably, the treatment method is bombardment with electrons.

In some embodiments, electron bombardment is performed until the biomass receives a total dose of at least 0.5 Mrad, e.g., at least 5, 10, 20, 30, or at least 40 Mrad. In some embodiments, the treatment is performed until the biomass receives a dose of from about 0.5 Mrad to about 150 Mrad, about 1 Mrad to about 100 Mrad, about 5 Mrad to about 75 Mrad, about 2 Mrad to about 75 Mrad, about 10 Mrad to about 50 Mrad, e.g., about 5 Mrad to about 50 Mrad, about 20 Mrad to about 40 Mrad, about 10 Mrad to about 35 Mrad, or from about 20 Mrad to about 30 Mrad. In some implementations, a total dose of 25 to 35 Mrad is preferred, applied ideally over a couple of seconds, e.g., at 5 Mrad/pass with each pass being applied for about one second. Applying a dose of greater than 7 to 9 Mrad/pass can in some cases cause thermal degradation of the feedstock material.
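The relationship between total dose, dose per pass, and number of passes described above can be illustrated with a brief calculation. The following Python sketch is illustrative only; the function names are hypothetical, and the choice of 7 Mrad/pass as the caution threshold is an assumption that takes the lower end of the "greater than 7 to 9 Mrad/pass" range recited above.

```python
import math

# Lower end of the "7 to 9 Mrad/pass" range above which thermal degradation
# may occur in some cases; using the lower bound is a conservative assumption.
CAUTION_MRAD_PER_PASS = 7.0


def passes_for_total_dose(total_dose_mrad: float,
                          dose_per_pass_mrad: float = 5.0) -> int:
    """Smallest whole number of passes that delivers at least the total dose."""
    if total_dose_mrad <= 0 or dose_per_pass_mrad <= 0:
        raise ValueError("doses must be positive")
    return math.ceil(total_dose_mrad / dose_per_pass_mrad)


def exceeds_caution_threshold(dose_per_pass_mrad: float) -> bool:
    """True if the per-pass dose is above the conservative caution threshold."""
    return dose_per_pass_mrad > CAUTION_MRAD_PER_PASS


if __name__ == "__main__":
    for total in (25.0, 30.0, 35.0):
        n = passes_for_total_dose(total)
        # At about one second per pass, n passes take roughly n seconds.
        print(f"{total} Mrad at 5 Mrad/pass: {n} passes")
    print(exceeds_caution_threshold(5.0))
```

Under these assumptions, the preferred total dose of 25 to 35 Mrad at 5 Mrad/pass corresponds to 5 to 7 passes, each of about one second.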

The biomass material (e.g., plant biomass, animal biomass, paper, and municipal waste biomass) can be used as feedstock to produce useful intermediates and products such as organic acids, salts of organic acids, anhydrides, esters of organic acids and fuels, e.g., fuels for internal combustion engines or feedstocks for fuel cells. Systems and processes are described herein that can use as feedstock cellulosic and/or lignocellulosic materials that are readily available, but often can be difficult to process, e.g., municipal waste streams and waste paper streams, such as streams that include newspaper, kraft paper, corrugated paper or mixtures of these.

In order to convert the feedstock to a form that can be readily processed, the glucan- or xylan-containing cellulose in the feedstock can be hydrolyzed to low molecular weight carbohydrates, such as sugars, by a saccharifying agent, e.g., an enzyme or acid, a process referred to as saccharification. The low molecular weight carbohydrates can then be used, for example, in an existing manufacturing plant, such as a single cell protein plant, an enzyme manufacturing plant, or a fuel plant, e.g., an ethanol manufacturing facility.

The feedstock can be hydrolyzed using an enzyme, e.g., by combining the materials and the enzyme in a solvent, e.g., in an aqueous solution. The enzymes can be made/induced according to the methods described herein.

Specifically, the enzymes can be supplied by organisms that are capable of breaking down biomass (such as the cellulose and/or the lignin portions of the biomass), or that contain or manufacture various cellulolytic enzymes (cellulases), ligninases or various small molecule biomass-degrading metabolites. These enzymes may be a complex of enzymes that act synergistically to degrade crystalline cellulose or the lignin portions of biomass. Examples of cellulolytic enzymes include: endoglucanases, cellobiohydrolases, and cellobiases (beta-glucosidases).

During saccharification, a cellulosic substrate can be initially hydrolyzed by endoglucanases at random locations producing oligomeric intermediates. These intermediates are then substrates for exo-splitting glucanases such as cellobiohydrolase to produce cellobiose from the ends of the cellulose polymer. Cellobiose is a water-soluble 1,4-linked dimer of glucose. Finally, cellobiase cleaves cellobiose to yield glucose. The efficiency (e.g., time to hydrolyze and/or completeness of hydrolysis) of this process depends on the recalcitrance of the cellulosic material.

Saccharification

The reduced-recalcitrance biomass is treated with the biomass-degrading enzymes discussed above, generally by combining the reduced-recalcitrance biomass and the biomass-degrading enzymes in a fluid medium, e.g., an aqueous solution. In some cases, the feedstock is boiled, steeped, or cooked in hot water prior to saccharification, as described in U.S. Pat. App. Pub. 2012/0100577 A1 by Medoff and Masterman, published on Apr. 26, 2012, the entire contents of which are incorporated herein by reference.

Provided herein are mixtures of enzymes that are capable of degrading the biomass, e.g., an enzyme mixture of biomass-degrading enzymes, for use in the saccharification process described herein.

The saccharification process can be partially or completely performed in a tank (e.g., a tank having a volume of at least 4000 L, 40,000 L, 500,000 L, 2,000,000 L, 4,000,000 L, or 6,000,000 L or more) in a manufacturing plant, and/or can be partially or completely performed in transit, e.g., in a rail car, tanker truck, or in a supertanker or the hold of a ship. The time required for complete saccharification will depend on the process conditions and the biomass material and enzyme used. If saccharification is performed in a manufacturing plant under controlled conditions, the cellulose may be substantially entirely converted to sugar, e.g., glucose, in about 12-96 hours. If saccharification is performed partially or completely in transit, saccharification may take longer.

In a preferred embodiment, the saccharification reaction occurs at a pH optimal for the enzymatic reactions to occur, e.g., at the pH optimal for the activity of the biomass-degrading enzymes. Preferably, the pH of the saccharification reaction is at pH 4-4.5. In a preferred embodiment, the saccharification reaction occurs at a temperature optimal for the enzymatic reactions to occur, e.g., at the temperature optimal for the activity of the biomass-degrading enzymes. Preferably, the temperature of the saccharification reaction is at 42° C. to 52° C.

It is generally preferred that the tank contents be mixed during saccharification, e.g., using jet mixing as described in International App. No. PCT/US2010/035331, filed May 18, 2010, which was published in English as WO 2010/135380 and designated the United States, the full disclosure of which is incorporated by reference herein.

The addition of surfactants can enhance the rate of saccharification. Examples of surfactants include non-ionic surfactants, such as Tween® 20 or Tween® 80 polyethylene glycol surfactants, ionic surfactants, or amphoteric surfactants.

It is generally preferred that the concentration of the sugar solution resulting from saccharification be relatively high, e.g., greater than 5%, 7.5%, 10%, 10.5%, or greater than 40%, or greater than 50, 60, 70, or even greater than 80% by weight. Water may be removed, e.g., by evaporation, to increase the concentration of the sugar solution. This reduces the volume to be shipped, and also inhibits microbial growth in the solution.
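By way of illustration only, the amount of water to be evaporated to reach a target sugar concentration can be estimated from a simple mass balance. The sketch below assumes that only water leaves the solution and that the sugar mass is conserved; the function name and the 1000 kg batch size are hypothetical and are not taken from this disclosure.

```python
def water_to_remove_kg(solution_kg: float, start_frac: float, target_frac: float) -> float:
    """Water (kg) to evaporate so the sugar weight fraction rises to target_frac.

    Assumption: only water is removed; sugar mass is unchanged.
    """
    if not 0 < start_frac <= target_frac < 1:
        raise ValueError("require 0 < start_frac <= target_frac < 1")
    sugar_kg = solution_kg * start_frac      # sugar mass is conserved
    final_kg = sugar_kg / target_frac        # total mass at the target concentration
    return solution_kg - final_kg


# Hypothetical batch: 1000 kg of a 10% by weight solution concentrated to 40% by weight.
removed = water_to_remove_kg(1000.0, 0.10, 0.40)
print(f"water to remove: {removed:.1f} kg")
```

Under these assumptions, three quarters of the batch mass is removed as water, which is consistent with the stated benefit of reducing the volume to be shipped.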

Alternatively, sugar solutions of lower concentrations may be used, in which case it may be desirable to add an antimicrobial additive, e.g., a broad spectrum antibiotic, in a low concentration, e.g., 50 to 150 ppm. Other suitable antibiotics include amphotericin B, ampicillin, chloramphenicol, ciprofloxacin, gentamicin, hygromycin B, kanamycin, neomycin, penicillin, puromycin, and streptomycin. Antibiotics will inhibit growth of microorganisms during transport and storage, and can be used at appropriate concentrations, e.g., between 15 and 10,000 ppm by weight, e.g., between 25 and 500 ppm, or between 50 and 150 ppm. If desired, an antibiotic can be included even if the sugar concentration is relatively high. Alternatively, other additives with antimicrobial or preservative properties may be used. Preferably the antimicrobial additive(s) are food-grade.
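As a non-limiting worked example, a ppm-by-weight dose can be converted to a mass of additive using the relation 1 ppm by weight = 1 mg of additive per kg of solution. The function name and the 1000 kg batch size below are hypothetical.

```python
def additive_mass_g(solution_kg: float, ppm_by_weight: float) -> float:
    """Grams of additive needed for a ppm-by-weight dose.

    1 ppm by weight corresponds to 1 mg of additive per kg of solution.
    """
    return solution_kg * ppm_by_weight / 1000.0  # mg -> g


# Hypothetical 1000 kg batch dosed at the 50 to 150 ppm range given in the text.
for ppm in (50.0, 150.0):
    print(f"{ppm:.0f} ppm -> {additive_mass_g(1000.0, ppm):.0f} g")
```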

A relatively high concentration solution can be obtained by limiting the amount of water added to the biomass material with the enzyme. The concentration can be controlled, e.g., by controlling how much saccharification takes place. For example, concentration can be increased by adding more biomass material to the solution. In order to keep the sugar that is being produced in solution, a surfactant can be added, e.g., one of those discussed above. Solubility can also be increased by increasing the temperature of the solution. For example, the solution can be maintained at a temperature of 40-50° C., 60-80° C., or even higher.

In the processes described herein, for example after saccharification, sugars (e.g., glucose and xylose) can be isolated. For example, sugars can be isolated by precipitation, crystallization, chromatography (e.g., simulated moving bed chromatography, high pressure chromatography), centrifugation, extraction, any other isolation method known in the art, and combinations thereof.

Mixtures for Use in Saccharification

In an aspect, the present invention features a mixture for use in a saccharification comprising a polypeptide having biomass-degrading activity that has been solubilized from an inclusion body, as described herein, one or more proteins associated with an inclusion body, and a solubilizing agent. In an embodiment, the mixture may contain other components of the inclusion body, such as other proteins endogenous to the host cell, ribosomal components, nucleic acids (e.g., RNA and/or DNA), and cellular debris (e.g., cell wall debris, lipids, metabolites). The polypeptide having biomass-degrading activity can be glycosylated or aglycosylated.

In one embodiment, the mixture comprises a polypeptide having cellobiase activity and urea. In an embodiment, the urea is present at 0.2-6 M. In one embodiment, the mixture comprises a Cel3a from T. reesei, or a functional variant or a fragment thereof. In one embodiment, the mixture comprises a polypeptide comprising at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to a Cel3a from T. reesei. In another embodiment, the mixture comprises a polypeptide comprising at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 1. The polypeptide having cellobiase activity can be glycosylated or aglycosylated.
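For illustration only, one common way to compute a percent identity between two already-aligned sequences is sketched below. This passage does not specify an alignment method or the denominator to be used, so the convention here (identical residues divided by aligned columns, counting gap columns) is an assumption; the second sequence is a hypothetical variant invented for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumptions: both strings have equal length, '-' marks a gap, and the
    denominator is the number of columns that are not gap-vs-gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if (a, b) != ("-", "-")]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b)  # gap-vs-residue never matches
    return 100.0 * matches / len(columns)


reference = "MRYRTAAALA"   # first ten residues of the Cel3a sequence listed herein
candidate = "MRYKTA-ALA"   # hypothetical variant: one substitution, one gap
identity = percent_identity(reference, candidate)
print(f"{identity:.1f}% identity; meets 50% threshold: {identity >= 50.0}")
```

Other conventions (e.g., excluding gap columns from the denominator) give different values for the same alignment, so the convention should be stated whenever a threshold is applied.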

In embodiments, the mixture described herein further comprises at least one additional enzyme derived from a microorganism, wherein the additional enzyme has biomass or cellulose-based material-degrading activity. For example, the additional enzyme is a ligninase, an endoglucanase, a cellobiohydrolase, a xylanase, or a cellobiase. In an embodiment, the mixture further comprises one or more ligninase, one or more endoglucanase, one or more cellobiohydrolase, one or more xylanase, or one or more cellobiase. In embodiments, the additional biomass-degrading enzyme is glycosylated. In embodiments, the enzyme mixture further comprises at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, or at least 20 or more additional biomass-degrading enzymes described herein. Typical primary amino acid sequences for several biomass-degrading enzymes are shown below.

For example, the mixture further comprises a mixture of additional biomass-degrading enzymes produced by a microorganism, e.g., a fungal cell, such as wild-type T. reesei, or a mutant thereof, e.g., T. reesei RUTC30. In an embodiment, the additional biomass-degrading enzymes are isolated from the microorganisms. In an embodiment, the mixture comprises one or more of the following biomass-degrading enzymes: B2AF03, CIP1, CIP2, Cel1a, Cel3a, Cel5a, Cel6a, Cel7a, Cel7b, Cel12a, Cel45a, Cel74a, paMan5a, paMan26a, or Swollenin, or any combination thereof. The additional biomass-degrading enzymes, e.g., listed above, can be endogenously expressed and isolated from the microorganism, e.g., fungal cell, from which the enzyme originates (listed below in Table 1). Alternatively, the additional biomass-degrading enzymes, e.g., listed above, can be heterologously expressed using similar methods of expression in a host cell described herein, and isolated from the host cells. In an embodiment, the heterologously expressed additional biomass-degrading enzymes are tagged with a His tag at the C or N terminus of the enzyme and are isolated using nickel affinity chromatography techniques known in the art. For example, the additional biomass-degrading enzymes are selected from Table 1 below.

TABLE 1 Examples of Additional Biomass-Degrading Enzymes

Protein     MW, kDa   no. AA's   th. pI   no. Cysteines   Organism
B2AF03      87.1      800        5.94     10              Podospora anserina
CIP1        32.9      316        4.93     8               Trichoderma reesei
CIP2        48.2      460        7.0      12              Trichoderma reesei
Cel1a       52.2      466        5.3      5               Trichoderma reesei
Cel3a       78.4      744        6.3      6               Trichoderma reesei
Cel5a       44.1      418        4.9      12              Trichoderma reesei
Cel6a       49.6      471        5.1      12              Trichoderma reesei
Cel7a       54.1      514        4.6      24              Trichoderma reesei
Cel7b       48.2      459        4.7      22              Trichoderma reesei
Cel12a      25.1      234        6.6      2               Trichoderma reesei
Cel45a      24.4      242        4.2      16              Trichoderma reesei
Cel74a      87.1      838        5.4      4               Trichoderma reesei
paMan5a     41.1      373        7.0      6               Podospora anserina
paMan26a    51.7      469        4.7      1               Podospora anserina
Swollenin   51.5      493        4.8      28              Trichoderma reesei
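As a rough consistency check on Table 1, the molecular weight divided by the number of residues should fall near the typical average residue mass of a protein (on the order of 110 Da). The sketch below encodes three rows of the table; the function name and the 100 to 115 Da window are illustrative assumptions, not part of this disclosure.

```python
# (MW in kDa, number of amino acids) copied from three rows of Table 1.
TABLE_1_SUBSET = {
    "Cel3a": (78.4, 744),
    "Cel7a": (54.1, 514),
    "Cel12a": (25.1, 234),
}


def mean_residue_mass_da(mw_kda: float, n_residues: int) -> float:
    """Average mass per residue in daltons."""
    return mw_kda * 1000.0 / n_residues


for name, (mw_kda, n_residues) in TABLE_1_SUBSET.items():
    mass = mean_residue_mass_da(mw_kda, n_residues)
    # The 100 to 115 Da window is an assumed plausibility range.
    print(f"{name}: {mass:.1f} Da per residue, plausible: {100.0 <= mass <= 115.0}")
```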

The amino acid sequences for the biomass-degrading enzymes listed in Table 1 are provided below.

B2AF03 (Podospora anserina) (SEQ ID NO: 6)MKSSVFWGASLTSAVVRAIDLPFQFYPNCVDDLLSTNQVCNTTLSPPERAAALVAALTPEEKLQNIVSKSLGAPRIGLPAYNWWSEALHGVAYAPGTQFWQGDGPFNSSTSFPMPLLMAATFDDELLEKIAEVIGIEGRAFGNAGFSGLDYWTPNVNPFKDPRWGRGSETPGEDVLLVKRYAAAMIKGLEGPVPEKERRVVATCKHYAANDFEDWNGATRHNFNAKISLQDMAEYYFMPFQQCVRDSRVGSIMCAYNAVNGVPSCASPYLLQTILREHWNWTEHNNYITSDCEAVLDVSLNHKYAATNAEGTAISFEAGMDTSCEYEGSSDIPGAWSQGLLKESTVDRALLRLYEGIVRAGYFDGKQSLYSSLGWADVNKPSAQKLSLQAAVDGTVLLKNDGTLPLSDLLDKSRPKKVAMIGFWSDAKDKLRGGYSGTAAYLHTPAYAASQLGIPFSTASGPILHSDLASNQSWTDNAMAAAKDADYILYFGGIDTSAAGETKDRYDLDWPGAQLSLINLLTTLSKPLIVLQMGDQLDNTPLLSNPKINAILWANWPGQDGGTAVMELVTGLKSPAGRLPVTQYPSNFTELVPMTDMALRPSAGNSQLGRTYRWYKTPVQAFGFGLHYTTFSPKFGKKFPAVIDVDEVLEGCDDKYLDTCPLPDLPVVVENRGNRTSDYVALAFVSAPGVGPGPWPIKTLGAFTRLRGVKGGEKREGGLKWNLGNLARHDEEGNTVVYPGKYEVSLDEPPKARLRFEIVRGGKGKGKVKGKGKAAQKGGVVLDRWPKPPKGQEPPAIERV C1P1 (Trichoderma reesei)(SEQ ID NO: 7)MVRRTALLALGALSTLSMAQISDDFESGWDQTKWPISAPDCNQGGTVSLDTTVAHSGSNSMKVVGGPNGYCGHIFFGTTQVPTGDVYVRAWIRLQTALGSNHVTFIIMPDTAQGGKHLRIGGQSQVLDYNRESDDATLPDLSPNGIASTVTLPTGAFQCFEYHLGTDGTIETWLNGSLIPGMTVGPGVDNPNDAGWTRASYIPEITGVNFGWEAYSGDVNTVWFDDISIASTRVGCGPGSPGGPGSSTTGRSSTSGPTSTSRPSTTIPPPTSRTTTATGPTQTHYGQCGGIGYSGPTVCASGTTCQVLNPYYSQCL C1P2 (Trichoderma reesei)(SEQ ID NO: 8)MASRFFALLLLAIPIQAQSPVWGQCGGIGWSGPTTCVGGATCVSYNPYYSQCIPSTQASSSIASTTLVTSFTTTTATRTSASTPPASSTGAGGATCSALPGSITLRSNAKLNDLFTMFNGDKVTTKDKFSCRQAEMSELIQRYELGTLPGRPSTLTASFSGNTLTINCGEAGKSISFTVTITYPSSGTAPYPAIIGYGGGSLPAPAGVAMINFNNDNIAAQVNTGSRGQGKFYDLYGSSHSAGAMTAWAWGVSRVIDALELVPGARIDTTKIGVTGCSRNGKGAMVAGAFEKRIVLTLPQESGAGGSACWRISDYLKSQGANIQTASEIIGEDPWFSTTFNSYVNQVPVLPFDHHSLAALIAPRGLFVIDNNIDWLGPQSCFGCMTAAHMAWQALGVSDHMGYSQIGAHAHCAFPSNQQSQLTAFVQKFLLGQSTNTAIFQSDFSANQSQWIDWTTPTLSCel1a (Trichoderma reesei) (SEQ ID NO: 
9)MLPKDFQWGFATAAYQIEGAVDQDGRGPSIWDTFCAQPGKIADGSSGVTACDSYNRTAEDIALLKSLGAKSYRFSISWSRIIPEGGRGDAVNQAGIDHYVKFVDDLLDAGITPFITLFHWDLPEGLHQRYGGLLNRTEFPLDFENYARVMFRALPKVRNWITFNEPLCSAIPGYGSGTFAPGRQSTSEPWTVGHNILVAHGRAVKAYRDDFKPASGDGQIGIVLNGDFTYPWDAADPADKEAAERRLEFFTAWFADPIYLGDYPASMRKQLGDRLPTFTPEERALVHGSNDFYGMNHYTSNYIRHRSSPASADDTVGNVDVLFTNKQGNCIGPETQSPWLRPCAAGFRDFLVWISKRYGYPPIYVTENGTSIKGESDLPKEKILEDDFRVKYYNEYIRAMVTAVELDGVNVKGYFAWSLMDNFEWADGYVTRFGVTYVDYENGQKRFPKKSAKSLKPLFDELIAAACel3a (Trichoderma reesei) (SEQ ID NO: 10)MRYRTAAALALATGPFARADSHSTSGASAEAVVPPAGTPWGTAYDKAKAALAKLNLQDKVGIVSGVGWNGGPCVGNTSPASKISYPSLCLQDGPLGVRYSTGSTAFTPGVQAASTWDVNLIRERGQFIGEEVKASGIHVILGPVAGPLGKTPQGGRNWEGFGVDPYLTGIAMGQTINGIQSVGVQATAKHYILNEQELNRETISSNPDDRTLHELYTWPFADAVQANVASVMCSYNKVNTTWACEDQYTLQTVLKDQLGFPGYVMTDWNAQHTTVQSANSGLDMSMPGTDFNGNNRLWGPALTNAVNSNQVPTSRVDDMVTRILAAWYLTGQDQAGYPSFNISRNVQGNHKTNVRAIARDGIVLLKNDANILPLKKPASIAVVGSAAIIGNHARNSPSCNDKGCDDGALGMGWGSGAVNYPYFVAPYDAINTRASSQGTQVTLSNTDNTSSGASAARGKDVAIVFITADSGEGYITVEGNAGDRNNLDPWHNGNALVQAVAGANSNVIVVVHSVGAIILEQILALPQVKAVVWAGLPSQESGNALVDVLWGDVSPSGKLVYTIAKSPNDYNTRIVSGGSDSFSEGLFIDYKHFDDANITPRYEFGYGLSYTKFNYSRLSVLSTAKSGPATGAVVPGGPSDLFQNVATVTVDIANSGQVTGAEVAQLYITYPSSAPRTPPKQLRGFAKLNLTPGQSGTATFNIRRRDLSYWDTASQKWVVPSGSFGISVGASSRDIRLTSTLSVACel5a (Trichoderma reesei) (SEQ ID NO: 11)MNKSVAPLLLAASILYGGAAAQQTVWGQCGGIGWSGPTNCAPGSACSTLNPYYAQCIPGATTITTSTRPPSGPTTTTRATSTSSSTPPTSSGVRFAGVNIAGFDFGCTTDGTCVTSKVYPPLKNFTGSNNYPDGIGQMQHFVNDDGMTIFRLPVGWQYLVNNNLGGNLDSTSISKYDQLVQGCLSLGAYCIVDIHNYARWNGGIIGQGGPTNAQFTSLWSQLASKYASQSRVWFGIMNEPHDVNINTWAATVQEVVTAIRNAGATSQFISLPGNDWQSAGAFISDGSAAALSQVTNPDGSTTNLIFDVHKYLDSDNSGTHAECTTNNIDGAFSPLATWLRQNNRQAILTETGGGNVQSCIQDMCQQIQYLNQNSDVYLGYVGWGAGSFDSTYVLTETPTGSGNSWTDTSLVSSCLARK Cel6a (Trichoderma reesei) (SEQ ID NO: 
12)MIVGILTTLATLATLAASVPLEERQACSSVWGQCGGQNWSGPTCCASGSTCVYSNDYYSQCLPGAASSSSSTRAASTTSRVSPTTSRSSSATPPPGSTTTRVPPVGSGTATYSGNPFVGVTPWANAYYASEVSSLAIPSLTGAMATAAAAVAKVPSFMWLDTLDKTPLMEQTLADIRTANKNGGNYAGQFVVYDLPDRDCAALASNGEYSIADGGVAKYKNYIDTIRQIVVEYSDIRTLLVIEPDSLANLVTNLGTPKCANAQSAYLECINYAVTQLNLPNVAMYLDAGHAGWLGWPANQDPAAQLFANVYKNASSPRALRGLATNVANYNGWNITSPPSYTQGNAVYNEKLYIHAIGPLLANHGWSNAFFITDQGRSGKQPTGQQQWGDWCNVIGTGFGIRPSANTGDSLLDSFVWVKPGGECDGTSDSSAPRFDSHCALPDALQPAPQAGAWFQAYFVQLLTNANPSFLCel7a (Trichoderma reesei) (SEQ ID NO: 13)MYRKLAVISAFLATARAQSACTLQSETHPPLTWQKCSSGGTCTQQTGSVVIDANWRWTHATNSSTNCYDGNTWSSTLCPDNETCAKNCCLDGAAYASTYGVTTSGNSLSIGFVTQSAQKNVGARLYLMASDTTYQEFTLLGNEFSFDVDVSQLPCGLNGALYFVSMDADGGVSKYPTNTAGAKYGTGYCDSQCPRDLKFINGQANVEGWEPSSNNANTGIGGHGSCCSEMDIWEANSISEALTPHPCTTVGQEICEGDGCGGTYSDNRYGGTCDPDGCDWNPYRLGNTSFYGPGSSFTLDTTKKLTVVTQFETSGAINRYYVQNGVTFQQPNAELGSYSGNELNDDYCTAEEAEFGGSSFSDKGGLTQFKKATSGGMVLVMSLWDDYYANMLWLDSTYPTNETSSTPGAVRGSCSTSSGVPAQVESQSPNAKVTFSNIKFGPIGSTGNPSGGNPPGGNPPGTTTTRRPATTTGSSPGPTQSHYGQCGGIGYSGPTVCASGTTCQVLNPYYSQCL Cel7b (Trichoderma reesei)(SEQ ID NO: 14)MAPSVTLPLTTAILAIARLVAAQQPGTSTPEVHPKLTTYKCTKSGGCVAQDTSVVLDWNYRWMHDANYNSCTVNGGVNTTLCPDEATCGKNCFIEGVDYAASGVTTSGSSLTMNQYMPSSSGGYSSVSPRLYLLDSDGEYVMLKLNGQELSFDVDLSALPCGENGSLYLSQMDENGGANQYNTAGANYGSGYCDAQCPVQTWRNGTLNTSHQGFCCNEMDILEGNSRANALTPHSCTATACDSAGCGFNPYGSGYKSYYGPGDTVDTSKTFTIITQFNTDNGSPSGNLVSITRKYQQNGVDIPSAQPGGDTISSCPSASAYGGLATMGKALSSGMVLVFSIWNDNSQYMNWLDSGNAGPCSSTEGNPSNILANNPNTHVVFSNIRWGDIGSTTNSTAPPPPPASSTTFSTTRRSSTTSSSPSCTQTHWGQCGGIGYSGCKTCTSGTTCQYSNDYYSQCLCel12a (Trichoderma reesei) (SEQ ID NO: 15)MKFLQVLPALIPAALAQTSCDQWATFTGNGYTVSNNLWGASAGSGFGCVTAVSLSGGASWHADWQWSGGQNNVKSYQNSQIAIPQKRTVNSISSMPTTASWSYSGSNIRANVAYDLFTAANPNHVTYSGDYELMIWLGKYGDIGPIGSSQGTVNVGGQSWTLYYGYNGAMQVYSFVAQTNTTNYSGDVKNFFNYLRDNKGYNAAGQYVLSYQFGTEPFTGSGTLNVASWTASIN Cel45a (Trichoderma reesei) (SEQ ID NO: 
16)MKATLVLGSLIVGAVSAYKATTTRYYDGQEGACGCGSSSGAFPWQLGIGNGVYTAAGSQALFDTAGASWCGAGCGKCYQLTSTGQAPCSSCGTGGAAGQSIIVMVTNLCPNNGNAQWCPVVGGTNQYGYSYHFDIMAQNEIFGDNVVVDFEPIACPGQAASDWGTCLCVGQQETDPTPVLGNDTGSTPPGSSPPATSSSPPSGGGQQTLYGQCGGAGWTGPTTCQAPGTCKVQNQWYSQCLP Cel74a (Trichoderma reesei)(SEQ ID NO: 17)MKVSRVLALVLGAVIPAHAAFSWKNVKLGGGGGFVPGIIFHPKTKGVAYARTDIGGLYRLNADDSWTAVTDGIADNAGWHNWGIDAVALDPQDDQKVYAAVGMYTNSWDPSNGAIIRSSDRGATWSFTNLPFKVGGNMPGRGAGERLAVDPANSNIIYFGARSGNGLWKSTDGGVTFSKVSSFTATGTYIPDPSDSNGYNSDKQGLMWVTFDSTSSTTGGATSRIFVGTADNITASVYVSTNAGSTWSAVPGQPGKYFPHKAKLQPAEKALYLTYSDGTGPYDGTLGSVWRYDIAGGTWKDITPVSGSDLYFGFGGLGLDLQKPGTLVVASLNSWWPDAQLFRSTDSGTTWSPIWAWASYPTETYYYSISTPKAPWIKNNFIDVTSESPSDGLIKRLGWMIESLEIDPTDSNHWLYGTGMTIFGGHDLTNWDTRHNVSIQSLADGIEEFSVQDLASAPGGSELLAAVGDDNGFTFASRNDLGTSPQTVWATPTWATSTSVDYAGNSVKSVVRVGNTAGTQQVAISSDGGATWSIDYAADTSMNGGTVAYSADGDTILWSTASSGVQRSQFQGSFASVSSLPAGAVIASDKKTNSVFYAGSGSTFYVSKDTGSSFTRGPKLGSAGTIRDIAAHPTTAGTLYVSTDVGIFRSTDSGTTFGQVSTALTNTYQIALGVGSGSNWNLYAFGTGPSGARLYASGDSGASWTDIQGSQGFGSIDSTKVAGSGSTAGQVYVGTNGRGVFYAQGTVGGGTGGTSSSTKQSSSSTSSASSSTTLRSSVVSTTRASTVTSSRTSSAAGPTGSGVAGHYAQCGGIGWTGPTQCVAPYVCQKQNDYYYQCV paMan5a (Podospora anserina) (SEQ ID NO: 18)MKGLFAFGLGLLSLVNALPQAQGGGAAASAKVSGTRFVIDGKTGYFAGTNSYWIGFLTNNRDVDTTLDHIASSGLKILRVWGFNDVNNQPSGNTVWFQRLASSGSQINTGPNGLQRLDYLVRSAETRGIKLIIALVNYWDDFGGMKAYVNAFGGTKESWYTNARAQEQYKRYIQAVVSRYVNSPAIFAWELANEPRCKGCNTNVIFNWATQISDYIRSLDKDHLITLGDEGFGLPGQTTYPYQYGEGTDFVKNLQIKNLDFGTFHMYPGHWGVPTSFGPGWIKDHAAACRAAGKPCLLEEYGYESDRCNVQKGWQQASRELSRDGMSGDLFWQWGDQLSTGQTHNDGFTIYYGSSLATCLVTDHVRAINALPA paMan26a (Podospora anserina)(SEQ ID NO: 
19)MVKLLDIGLFALALASSAVAKPCKPRDGPVTYEAEDAILTGTTVDTAQVGYTGRGYVTGFDEGSDKITFQISSATTKLYDLSIRYAAIYGDKRTNVVLNNGAVSEVFFPAGDSFTSVAAGQVLLNAGQNTIDIVNNWGWYLIDSITLTPSAPRPPHDINPNLNNPNADTNAKKLYSYLRSVYGNKIISGQQELHHAEWIRQQTGKTPALVAVDLMDYSPSRVERGTTSHAVEDAIAHHNAGGIVSVLWHWNAPVGLYDTEENKWWSGFYTRATDFDIAATLANPQGANYTLLIRDIDAIAVQLKRLEAAGVPVLWRPLHEAEGGWFWWGAKGPEPAKQLWDILYERLTVHHGLDNLIWVWNSILEDWYPGDDTVDILSADVYAQGNGPMSTQYNELIALGRDKKMIAAAEVGAAPLPGLLQAYQANWLWFAVWGDDFINNPSWNTVAVLNEIYNSDYVLTLDEIQGWRSSwollenin (Trichoderma reesei) (SEQ ID NO: 20)MAGKLILVALASLVSLSIQQNCAALFGQCGGIGWSGTTCCVAGAQCSFVNDWYSQCLASTGGNPPNGTTSSSLVSRTSSASSSVGSSSPGGNSPTGSASTYTTTDTATVAPHSQSPYPSIAASSCGSWTLVDNVCCPSYCANDDTSESCSGCGTCTTPPSADCKSGTMYPEVHHVSSNESWHYSRSTHFGLTSGGACGFGLYGLCTKGSVTASWTDPMLGATCDAFCTAYPLLCKDPTGTTLRGNFAAPNGDYYTQFWSSLPGALDNYLSCGECIELIQTKPDGTDYAVGEAGYTDPITLEIVDSCPCSANSKWCCGPGADHCGEIDFKYGCPLPADSIHLDLSDIAMGRLQGNGSLTNGVIPTRYRRVQCPKVGNAYIWLRNGGGPYYFALTAVNTNGPGSVTKIEIKGADTDNWVALVHDPNYTSSRPQERYGSWVIPQGSGPFNLPVGIRLTSPTGEQIVNEQAIKTFTPPATGDPNFYYIDIGVQFSQN

Other examples of suitable biomass-degrading enzymes for use in the enzyme mixture of the present invention include the enzymes from species in the genera Bacillus, Coprinus, Myceliophthora, Cephalosporium, Scytalidium, Penicillium, Aspergillus, Pseudomonas, Humicola, Fusarium, Thielavia, Acremonium, Chrysosporium and Trichoderma, especially those produced by a strain selected from the species Aspergillus (see, e.g., EP Pub. No. 0 458 162), Humicola insolens (reclassified as Scytalidium thermophilum, see, e.g., U.S. Pat. No. 4,435,307), Coprinus cinereus, Fusarium oxysporum, Myceliophthora thermophila, Meripilus giganteus, Thielavia terrestris, Acremonium sp. (including, but not limited to, A. persicinum, A. acremonium, A. brachypenium, A. dichromosporum, A. obclavatum, A. pinkertoniae, A. roseogriseum, A. incoloratum, and A. furatum). Preferred strains include Humicola insolens DSM 1800, Fusarium oxysporum DSM 2672, Myceliophthora thermophila CBS 117.65, Cephalosporium sp. RYM-202, Acremonium sp. CBS 478.94, Acremonium sp. CBS 265.95, Acremonium persicinum CBS 169.65, Acremonium acremonium AHU 9519, Cephalosporium sp. CBS 535.71, Acremonium brachypenium CBS 866.73, Acremonium dichromosporum CBS 683.73, Acremonium obclavatum CBS 311.74, Acremonium pinkertoniae CBS 157.70, Acremonium roseogriseum CBS 134.56, Acremonium incoloratum CBS 146.62, and Acremonium furatum CBS 299.70H. Biomass-degrading enzymes may also be obtained from Chrysosporium, preferably a strain of Chrysosporium lucknowense. Additional strains that can be used include, but are not limited to, Trichoderma (particularly T. viride, T. reesei, and T. koningii), alkalophilic Bacillus (see, for example, U.S. Pat. No. 3,844,890 and EP Pub. No. 0 458 162), and Streptomyces (see, e.g., EP Pub. No. 0 458 162).

In embodiments, the microorganism is induced to produce the biomass-degrading enzymes described herein under conditions suitable for increasing production of biomass-degrading enzymes compared to an uninduced microorganism. For example, an induction biomass sample comprising biomass as described herein is incubated with the microorganism to increase production of the biomass-degrading enzymes. Further description of the induction process can be found in US 2014/0011258, the contents of which are hereby incorporated by reference in their entirety.

The biomass-degrading enzymes produced and/or secreted by the aforementioned microorganisms can be isolated and added to the mixture of the present invention, or directly to the saccharification reaction. Alternatively, in one embodiment, the aforementioned microorganisms or host cells expressing the biomass-degrading enzymes described herein and above are not lysed before addition to the saccharification reaction.

In an embodiment, an enzyme mixture comprising the host cell expressing one or more additional biomass-degrading enzymes as described herein can be used with the mixture comprising the solubilized polypeptide having biomass-degrading activity described herein.

Use of the mixture described herein comprising a polypeptide having biomass-degrading activity solubilized from inclusion bodies and a solubilizing agent does not inhibit, prevent or decrease the yield of sugar products from saccharification compared to saccharification without addition of the solubilized polypeptide. In some embodiments, the yield of sugar products increases upon use of the mixture described herein comprising a polypeptide having biomass-degrading activity solubilized from inclusion bodies and a solubilizing agent. In such embodiments, the yield of sugar products increases by at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100% compared to when the standard mixture of biomass-degrading enzymes is added to the saccharification without the mixture containing solubilized polypeptide and solubilizing agent.
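The percentage increase referred to above can be read as a relative change against the saccharification performed with the standard enzyme mixture alone. The following sketch shows that arithmetic; the function name and the yield values are hypothetical and do not represent measured results.

```python
def percent_yield_increase(baseline_yield: float, test_yield: float) -> float:
    """Relative increase in sugar yield, in percent, versus the baseline."""
    if baseline_yield <= 0:
        raise ValueError("baseline yield must be positive")
    return 100.0 * (test_yield - baseline_yield) / baseline_yield


# Hypothetical yields in g/L: standard mixture alone vs. with the solubilized polypeptide.
increase = percent_yield_increase(40.0, 46.0)
print(f"yield increase: {increase:.1f}%")
```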

Further Processing

Further processing steps may be performed on the sugars produced by saccharification to produce alternative products. For example, the sugars can be hydrogenated, fermented, or treated with other chemicals to produce other products.

Glucose can be hydrogenated to sorbitol. Xylose can be hydrogenated to xylitol. Hydrogenation can be accomplished by use of a catalyst (e.g., Pt/gamma-Al₂O₃, Ru/C, Raney Nickel, or other catalysts known in the art) in combination with H₂ under high pressure (e.g., 10 to 12000 psi). The sorbitol and/or xylitol products can be isolated and purified using methods known in the art.

Sugar products from saccharification can also be fermented to produce alcohols, sugar alcohols, such as erythritol, or organic acids, e.g., lactic, glutamic or citric acids or amino acids.

Yeast and Zymomonas bacteria, for example, can be used for fermentation or conversion of sugar(s) to alcohol(s). Other microorganisms are discussed below. The optimum pH for fermentations is about pH 4 to 7. For example, the optimum pH for yeast is from about pH 4 to 5, while the optimum pH for Zymomonas is from about pH 5 to 6. Typical fermentation times are about 24 to 168 hours (e.g., 24 to 96 hrs) with temperatures in the range of 20° C. to 40° C. (e.g., 26° C. to 40° C.); however, thermophilic microorganisms prefer higher temperatures.

In some embodiments, e.g., when anaerobic organisms are used, at least a portion of the fermentation is conducted in the absence of oxygen, e.g., under a blanket of an inert gas such as N₂, Ar, He, CO₂ or mixtures thereof. Additionally, the mixture may have a constant purge of an inert gas flowing through the tank during part of or all of the fermentation. In some cases, anaerobic conditions can be achieved or maintained by carbon dioxide production during the fermentation and no additional inert gas is needed.

In some embodiments, all or a portion of the fermentation process can be interrupted before the low molecular weight sugar is completely converted to a product (e.g., ethanol). The intermediate fermentation products include sugar and carbohydrates in high concentrations. The sugars and carbohydrates can be isolated via any means known in the art. These intermediate fermentation products can be used in preparation of food for human or animal consumption. Additionally or alternatively, the intermediate fermentation products can be ground to a fine particle size in a stainless-steel laboratory mill to produce a flour-like substance.

Jet mixing may be used during fermentation, and in some cases saccharification and fermentation are performed in the same tank.

Nutrients for the microorganisms may be added during saccharification and/or fermentation, for example the food-based nutrient packages described in U.S. Pat. App. Pub. 2012/0052536, filed Jul. 15, 2011, the complete disclosure of which is incorporated herein by reference.

“Fermentation” includes the methods and products that are disclosed in U.S. Prov. App. No. 61/579,559, filed Dec. 22, 2012, and U.S. Prov. App. No. 61/579,576, filed Dec. 22, 2012, the contents of both of which are incorporated by reference herein in their entirety.

Mobile fermenters can be utilized, as described in International App. No. PCT/US2007/074028 (which was filed Jul. 20, 2007, was published in English as WO 2008/011598 and designated the United States), the contents of which are incorporated herein in their entirety. Similarly, the saccharification equipment can be mobile. Further, saccharification and/or fermentation may be performed in part or entirely during transit.

The microorganism(s) used in fermentation can be naturally-occurring microorganisms and/or engineered microorganisms. For example, the microorganism can be a bacterium (including, but not limited to, e.g., a cellulolytic bacterium), a fungus (including, but not limited to, e.g., a yeast), a plant, a protist, e.g., a protozoa or a fungus-like protist (including, but not limited to, e.g., a slime mold), or an alga. When the organisms are compatible, mixtures of organisms can be utilized.

Suitable fermenting microorganisms have the ability to convert carbohydrates, such as glucose, fructose, xylose, arabinose, mannose, galactose, oligosaccharides or polysaccharides into fermentation products. Fermenting microorganisms include strains of the genus Saccharomyces spp. (including, but not limited to, S. cerevisiae (baker's yeast), S. diastaticus, S. uvarum), the genus Kluyveromyces (including, but not limited to, K. marxianus, K. fragilis), the genus Candida (including, but not limited to, C. pseudotropicalis, and C. brassicae), Pichia stipitis (a relative of Candida shehatae), the genus Clavispora (including, but not limited to, C. lusitaniae and C. opuntiae), the genus Pachysolen (including, but not limited to, P. tannophilus), the genus Brettanomyces (including, but not limited to, e.g., B. clausenii (Philippidis, G. P., 1996, Cellulose bioconversion technology, in Handbook on Bioethanol: Production and Utilization, Wyman, C. E., ed., Taylor & Francis, Washington, D.C., 179-212)). Other suitable microorganisms include, for example, Zymomonas mobilis, Clostridium spp. (including, but not limited to, C. thermocellum (Philippidis, 1996, supra), C. saccharobutylacetonicum, C. saccharobutylicum, C. puniceum, C. beijerinckii, and C. acetobutylicum), Moniliella pollinis, Moniliella megachiliensis, Lactobacillus spp., Yarrowia lipolytica, Aureobasidium sp., Trichosporonoides sp., Trigonopsis variabilis, Trichosporon sp., Moniliella acetoabutans sp., Typhula variabilis, Candida magnoliae, Ustilaginomycetes sp., Pseudozyma tsukubaensis, yeast species of genera Zygosaccharomyces, Debaryomyces, Hansenula and Pichia, and fungi of the dematioid genus Torula.

For instance, Clostridium spp. can be used to produce ethanol, butanol, butyric acid, acetic acid, and acetone. Lactobacillus spp. can be used to produce lactic acid.

Many such microbial strains are publicly available, either commercially or through depositories such as the ATCC (American Type Culture Collection, Manassas, Va., USA), the NRRL (Agricultural Research Service Culture Collection, Peoria, Ill., USA), or the DSMZ (Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH, Braunschweig, Germany), to name a few.

Commercially available yeasts include, for example, Red Star®/Lesaffre Ethanol Red (available from Red Star/Lesaffre, USA), FALI® (available from Fleischmann's Yeast, a division of Burns Philip Food Inc., USA), SUPERSTART® (available from Alltech, now Lallemand), GERT STRAND® (available from Gert Strand AB, Sweden) and FERMOL® (available from DSM Specialties).

Many microorganisms that can be used to saccharify biomass material and produce sugars can also be used to ferment and convert those sugars to useful products.

After fermentation, the resulting fluids can be distilled using, for example, a “beer column” to separate ethanol and other alcohols from the majority of water and residual solids. The vapor exiting the beer column can be, e.g., 35% by weight ethanol and can be fed to a rectification column. A mixture of nearly azeotropic (92.5%) ethanol and water from the rectification column can be purified to pure (99.5%) ethanol using vapor-phase molecular sieves. The beer column bottoms can be sent to the first effect of a three-effect evaporator. The rectification column reflux condenser can provide heat for this first effect. After the first effect, solids can be separated using a centrifuge and dried in a rotary dryer. A portion (25%) of the centrifuge effluent can be recycled to fermentation and the rest sent to the second and third evaporator effects. Most of the evaporator condensate can be returned to the process as fairly clean condensate with a small portion split off to waste water treatment to prevent build-up of low-boiling compounds.
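The stream compositions given above can be related by a simple ethanol mass balance. The sketch below is illustrative only: it assumes ideal recovery (no ethanol lost at any stage) and treats the stated percentages as weight fractions; the function names and the stream sizes are hypothetical.

```python
def distillation_streams(beer_vapor_kg: float,
                         vapor_frac: float = 0.35,
                         rectified_frac: float = 0.925,
                         final_frac: float = 0.995) -> dict:
    """Idealized stream masses (kg), assuming all ethanol is carried forward."""
    ethanol_kg = beer_vapor_kg * vapor_frac
    return {
        "ethanol_kg": ethanol_kg,
        "rectified_kg": ethanol_kg / rectified_frac,  # near-azeotropic mixture
        "dehydrated_kg": ethanol_kg / final_frac,     # after molecular sieves
    }


def split_centrifuge_effluent(effluent_kg: float, recycle_frac: float = 0.25):
    """Return (recycled to fermentation, sent to second and third effects)."""
    recycled_kg = effluent_kg * recycle_frac
    return recycled_kg, effluent_kg - recycled_kg


# Hypothetical 1000 kg of beer column vapor and 400 kg of centrifuge effluent.
for name, mass_kg in distillation_streams(1000.0).items():
    print(f"{name}: {mass_kg:.1f}")
print(split_centrifuge_effluent(400.0))
```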

Other types of chemical transformation of the products from the processes described herein can be used, for example, production of organic sugar derived products such as furfural and furfural-derived products. Chemical transformations of sugar derived products are described in U.S. Prov. App. No. 61/667,481, filed Jul. 3, 2012, the disclosure of which is incorporated herein by reference in its entirety.

EXAMPLES

The invention is further described in detail by reference to the following experimental examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified. Thus, the invention should in no way be construed as being limited to the following examples, but rather, should be construed to encompass any and all variations which become evident as a result of the teaching provided herein.

Without further description, it is believed that one of ordinary skill in the art can, using the preceding description and the following illustrative examples, make and utilize the compounds of the present invention and practice the claimed methods. The following working examples specifically point out various aspects of the present invention, and are not to be construed as limiting in any way the remainder of the disclosure.

Example 1: Expression of Cel3a-C′His in E. coli

The mature sequence for Cel3a (amino acids 20-744) was synthesized and codon-optimized for E. coli expression by Genewiz. The Cel3a-C′His referred to in the following examples is the codon-optimized mature sequence for Cel3a (aas 20-744) with an 8× His (SEQ ID NO: 21) tag at the C-terminus. The below primers were used to clone the Cel3a-C′His into pET-Duet (Novagen, Catalog No. 71146):

Forward (SEQ ID NO: 4) 5′-CATGCCATG GGCGATAGTCACAGTACCAGC

Reverse (SEQ ID NO: 5) 3′-CCCAAGCTT TCATTA GTGATGATGATGATGATGATGATGGCTGCCGCTGCCGGCAACACTCAGGGTGC

(NcoI and HindIII sites are underlined; start and stop codons are in bold; the polyhistidine (8-His (SEQ ID NO: 21)) tag and glycine-serine (GSGS (SEQ ID NO: 23)) linker are italicized.)

The amplification reaction was performed using PfuUltra II Fusion HS Polymerase (Agilent, Catalog No. 600672).
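The design of the reverse primer can be checked by taking its reverse complement and translating it. The sketch below is illustrative and rests on stated assumptions: the primer is taken to read 5′ to 3′ as the concatenation of the segments shown, and the reading-frame offset of 2 is inferred so that the translation ends in the C-terminal residues TLSVA of the Cel3a sequence listed herein. The helper names are hypothetical.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    residues = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


# Assumed segmentation of the reverse primer, written 5' to 3'.
REVERSE_PRIMER = (
    "CCC" + "AAGCTT"          # overhang and HindIII site
    + "TCATTA"                # complement of two stop codons (TAA TGA)
    + "GTG" + "ATG" * 7       # complement of eight histidine codons
    + "GCTGCCGCTGCC"          # complement of the GSGS linker codons
    + "GGCAACACTCAGGGTGC"     # anneals to the 3' end of the Cel3a coding sequence
)

coding_strand = reverse_complement(REVERSE_PRIMER)
print(translate(coding_strand[2:]))  # frame offset of 2 is an inferred assumption
```

Under these assumptions, the translated product reads as the end of Cel3a, followed by the GSGS linker and eight histidines, and terminates at the tandem stop codons.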

The amplified DNA was cloned by restriction digestion using NcoI restriction enzyme (New England Biolabs, R3193) and HindIII restriction enzyme (New England Biolabs, R3104) under conditions suggested by the manufacturer. The digested amplified DNA was ligated into the NcoI-HindIII sites in the pETDuet vector using T4 DNA ligase (New England Biolabs, M0202), followed by transformation of E. coli cloning host Top 10 One Shot (Invitrogen). Plasmid purification was carried out using Qiagen's plasmid purification kit.

The Cel3a-C′His constructs were transformed into the E. coli expression host Origami B (DE3) (EMD Millipore, Catalog No. 70837) and streaked on plates containing LB medium and 100 μg/ml ampicillin (Fisher Scientific, Catalog No. BP1760), 15 μg/ml kanamycin (Fisher Scientific, Catalog No. BP906) and 12.5 μg/ml tetracycline (Fisher Scientific, Catalog No. BP912). Colonies carrying the recombinant DNA were picked from plates for the inoculation of 2 ml starter cultures, and grown overnight at 37° C., then subsequently used to inoculate 100 ml of LB media containing the appropriate antibiotics. Cultures were grown at 37° C. until OD600 reached 0.8.

To induce protein expression, 500 μM IPTG (isopropyl-β-D-thiogalactopyranoside; Fisher Scientific, Catalog No. BP1755) was added. The expression culture was further grown for another 4 hours at 37° C. The cells were harvested by centrifugation at 4200 rpm at room temperature (RT) for 30 minutes using the Sorvall St16 rotor TX400.
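For illustration, the volume of IPTG stock needed to reach the stated final concentration follows from C1·V1 = C2·V2. The stock concentration of 1 M below is a hypothetical assumption (no stock concentration is given in this example), and the small change in culture volume upon addition is ignored.

```python
def stock_volume_ul(final_um: float, culture_ml: float, stock_molar: float) -> float:
    """Microliters of stock to add, from C1*V1 = C2*V2.

    Assumption: the added volume is negligible relative to the culture volume.
    """
    final_molar = final_um * 1e-6
    volume_ml = final_molar * culture_ml / stock_molar
    return volume_ml * 1000.0  # ml -> microliters


# 500 uM final in a 100 ml culture, using an assumed (hypothetical) 1 M stock.
print(f"{stock_volume_ul(500.0, 100.0, 1.0):.1f} ul")
```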

Example 2: Solubilization of Cel3a from the Insoluble Fraction

An E. coli culture expressing an enzyme having biomass-degrading activity, Cel3a, was cultured and enzyme expression was induced, as described in Example 1. Isolation of Cel3a tagged with a His tag at the C-terminus (Cel3a-C′His) from the soluble and insoluble fractions was performed as follows. The cell culture was centrifuged at 4200 rpm for 30 minutes. The supernatant was discarded and the cell pellet was re-suspended in lysis buffer with 1 mg/mL lysozyme. Lysonase Bioprocessing Reagent (EMD Millipore 71320) was added, e.g., at 10 μl per gram of cell paste, and the sample was incubated for 1 hour at ambient temperature. After 1 hour, the sample was sonicated for a total of 2 minutes in 30 second intervals. Following sonication, the sample was centrifuged for 30 minutes at 10000 rpm. The supernatant, or soluble fraction, contains solubilized Cel3a, while the remaining pellet, or insoluble fraction, contains inclusion bodies and insoluble Cel3a.

The insoluble fraction was re-suspended in buffer containing a solubilizing agent, specifically 6M Urea pH 8 IMAC binding buffer, for 15 minutes and vortexed at room temperature. The sample was then filtered through a 0.45 μm membrane to prepare for IMAC purification. The amount of 6M Urea pH 8 IMAC binding buffer added was proportional to the amount of cell mass, e.g., 2 or 3 volumes of buffer to 1 volume of cell mass. The amount of binding buffer was increased as the cell mass increased in order to make filtering of the sample possible.
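The proportional reagent amounts used in this example (e.g., 10 μl of Lysonase per gram of cell paste, and 2 or 3 volumes of urea binding buffer per volume of cell mass) can be scaled to a given preparation with simple arithmetic. The sketch below is illustrative only; the function names and the default ratio of 3 volumes are assumptions chosen for the example.

```python
def lysonase_volume_ul(cell_paste_g, ul_per_g=10.0):
    """Lysonase volume in microliters for a given mass of cell paste in grams."""
    if cell_paste_g < 0 or ul_per_g < 0:
        raise ValueError("mass and dosing rate must be non-negative")
    return cell_paste_g * ul_per_g


def urea_buffer_volume_ml(cell_mass_ml, buffer_volumes=3.0):
    """Volume of 6M urea binding buffer in mL for a given volume of cell mass in mL."""
    if cell_mass_ml < 0 or buffer_volumes <= 0:
        raise ValueError("cell mass must be non-negative and the ratio positive")
    return cell_mass_ml * buffer_volumes


# Hypothetical preparation: 2.5 g of cell paste, 4 mL of insoluble cell mass.
print(lysonase_volume_ul(2.5))          # Lysonase volume in microliters
print(urea_buffer_volume_ml(4.0, 2.0))  # buffer volume in mL at 2 volumes per volume
```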

Example 3: Purification of Cel3a

Purification of Soluble Cel3a

The soluble fraction from Example 2 was transferred to a fresh tube containing 100 μl of pre-equilibrated Bio-Scale™ Profinity (BioRad) Ni-charged IMAC resin slurry (BioRad, Catalog No. 732-4614). The native binding buffer contained 50 mM Tris HCl pH 7.5, 150 mM NaCl, 0.1% Triton X-100, and 5 μM imidazole. The protein was batch-bound for 1 hour at room temperature (RT), and then washed with native buffer containing 25 μM imidazole. The protein was eluted in 300 μl of native buffer containing 200 μM imidazole.

Purification of Insoluble Cel3a

A Bio-Scale™ Mini Profinity IMAC 5 mL cartridge (BioRad, Catalog No. 732-4614) was equilibrated with 5 column volumes of 6M Urea pH 8 IMAC binding buffer at a flow rate of 5 mL/min. After column equilibration, the resuspended insoluble fraction from Example 2 was loaded at a flow rate of 1 mL/min. The column then received a 15 column volume wash of the 6M Urea pH 8 IMAC binding buffer at a flow rate of 5 mL/min. The solubilized Cel3a was then eluted from the column with 10 column volumes of 6M Urea pH 4 IMAC elution buffer at a flow rate of 5 mL/min. The resulting solubilized Cel3a sample contains 6M urea.

IMAC chromatography analysis was performed (using IMAC columns, Bio-Scale™ Mini Profinity™ IMAC Cartridges 5 mL, Catalog #732-4614), and as shown in FIG. 1, purified solubilized Cel3a was detected.

SDS-PAGE analysis was performed to assess the amount of Cel3a from the purification described above in the following fractions: purified soluble Cel3a (lane 2 in FIG. 2); flow through from the IMAC purification of the insoluble fraction (lane 3 in FIG. 2); and the purified Cel3a from the insoluble fraction (solubilized Cel3a) (lane 4 in FIG. 2). As shown in FIG. 2, Cel3a was successfully isolated from the inclusion bodies of the insoluble fraction using the methods described above.

Example 4: Analysis of Cellobiase Activity of Solubilized Cel3a

Cel3a was purified using IMAC techniques, as described in Example 3. Prior to performing the activity assay, the amount (titer) of purified Cel3a was determined using the Bradford assay and/or the NanoDrop. For NanoDrop quantification, the molar extinction coefficient was estimated by entering the amino acid sequence of the target form of Cel3a into the ExPASy ProtParam online tool.
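For orientation, the kind of estimate that ProtParam reports can be sketched as follows. ProtParam computes the molar extinction coefficient at 280 nm from the numbers of tryptophan, tyrosine, and cystine residues, using 5500, 1490, and 125 M⁻¹ cm⁻¹ per residue, respectively; the protein concentration then follows from the Beer-Lambert law. The sketch is illustrative only: the function names are hypothetical, the sequence is a toy placeholder rather than Cel3a, and the number of cystines must be supplied by the user because it cannot be read from the sequence alone.

```python
def molar_extinction_280(sequence, cystines=0):
    """Estimate the extinction coefficient at 280 nm in M^-1 cm^-1.

    Uses per-residue values of 5500 (Trp), 1490 (Tyr), and 125 (cystine).
    """
    seq = sequence.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * cystines


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: concentration (M) = A280 / (epsilon * path length)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


# Toy placeholder sequence (not Cel3a): two Trp, one Tyr, no disulfide bonds.
toy_sequence = "MGWSAYKWLC"
epsilon = molar_extinction_280(toy_sequence)
print(epsilon)                            # extinction coefficient, M^-1 cm^-1
print(molar_concentration(0.5, epsilon))  # molar concentration at A280 = 0.5
```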

For the activity assay, two-fold serial dilutions of samples containing purified Cel3a were prepared using 50 mM sodium citrate, pH 5.0 NaOH as buffer. Dilutions were aliquoted across one row of a 96 well plate. Dilutions were incubated with a D-(+)-Cellobiose (Fluka) substrate solution in 50 mM sodium citrate monobasic buffer at pH 5.0, at 48° C. for 30 minutes. The plates were immediately sealed using an adhesive plate seal and placed on a microplate incubator shaker set at 48° C., 700 rpm. After 30 minutes, the samples were heated on a heating dry bath for 5 minutes at 100° C. to stop the reaction. The plate was then filtered through a 96 well format 0.45 μm Durapore membrane. The filtrate samples were analysed for glucose and cellobiose using the YSI Biochemistry analyser (YSI Life Sciences) and/or HPLC (UPLC) methods. The cellobiase activity of the dilutions of purified Cel3a from the soluble and insoluble fractions was plotted on a graph and the results are shown in FIG. 3. FIG. 3 shows that the solubilized Cel3a has cellobiase activity, even in the presence of urea.
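The enzyme concentrations in a two-fold dilution series across one plate row can be tabulated as in the sketch below. This is illustrative only: the function name is hypothetical, and the sketch assumes that the first well holds the undiluted sample and that all 12 wells of the row are used, neither of which is specified above.

```python
def two_fold_series(stock_concentration, n_wells=12):
    """Concentrations across a row, halving at each step from the undiluted stock."""
    if n_wells < 1:
        raise ValueError("at least one well is required")
    return [stock_concentration / 2 ** i for i in range(n_wells)]


# Hypothetical stock of 8.0 (arbitrary units) diluted across four wells.
for well, concentration in enumerate(two_fold_series(8.0, n_wells=4), start=1):
    print(f"well {well}: {concentration}")
```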

Cellobiase activity was also assessed for solubilized Cel3a that had not been purified by the IMAC purification methods described in Example 3. The cell pellet of cells expressing Cel3a was washed with lysis buffer before solubilizing with the solubilizing buffer containing 6M urea. Cellobiase activity of the crude lysate sample containing solubilized Cel3a in 6M urea buffer was assayed by the cellobiase assay described above. The percentage of cellobiose converted to glucose in 30 minutes was compared between soluble Cel3a, the soluble wash, and the solubilized Cel3a from crude lysate (FIG. 4). The solubilized Cel3a without purification also possessed cellobiase activity.
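One way to express the percentage of cellobiose converted to glucose from measured concentrations is sketched below. Each mole of cellobiose hydrolyzed yields two moles of glucose, so the moles of cellobiose consumed equal half the moles of glucose formed. The sketch is illustrative and does not reproduce the calculation behind FIG. 4: the function name is hypothetical, and it assumes mass concentrations in g/L, no glucose present before the reaction, and glucose arising only from cellobiose hydrolysis.

```python
GLUCOSE_G_PER_MOL = 180.156      # C6H12O6
CELLOBIOSE_G_PER_MOL = 342.297   # C12H22O11


def percent_cellobiose_converted(glucose_g_per_l, initial_cellobiose_g_per_l):
    """Percentage of the initial cellobiose hydrolyzed, from the glucose formed."""
    if initial_cellobiose_g_per_l <= 0:
        raise ValueError("initial cellobiose concentration must be positive")
    glucose_mol = glucose_g_per_l / GLUCOSE_G_PER_MOL
    cellobiose_mol = initial_cellobiose_g_per_l / CELLOBIOSE_G_PER_MOL
    # Two glucose molecules are released per cellobiose molecule hydrolyzed.
    return 100.0 * (glucose_mol / 2.0) / cellobiose_mol


# Hypothetical measurement: 10 g/L cellobiose at the start, 2.0 g/L glucose after 30 minutes.
print(round(percent_cellobiose_converted(2.0, 10.0), 1))
```

Because hydrolysis adds one water molecule per cellobiose, complete conversion of 10 g/L cellobiose gives slightly more than 10 g/L glucose, which is why the calculation is done in moles rather than by mass ratio.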

EQUIVALENTS

The disclosures of each and every patent, patent application, and publication cited herein are hereby incorporated herein by reference in their entirety. While this invention has been disclosed with reference to specific aspects, it is apparent that other aspects and variations of this invention may be devised by others skilled in the art without departing from the true spirit and scope of the invention. The appended claims are intended to be construed to include all such aspects and equivalent variations.

What is claimed is:
 1. A mixture comprising a plurality of polypeptides having biomass-degrading activity and a solubilizing agent, wherein the polypeptides have at least 8-10% of the biomass-degrading activity compared to a native polypeptide having biomass-degrading activity.
 2. The mixture of claim 1, further comprising one or more proteins associated with an inclusion body.
 3. The mixture of claim 1, wherein the mixture does not comprise one or more proteins associated with an inclusion body.
 4. The mixture of any of the preceding claims, further comprising cellular debris, one or more ribosomal component, one or more host protein, and/or host nucleic acid comprising DNA and/or RNA.
 5. The mixture of any of the preceding claims, wherein the biomass-degrading activity is cellobiase activity, ligninase activity, endoglucanase activity, cellobiohydrolase activity, or xylanase activity.
 6. The mixture of any of the preceding claims, wherein the polypeptide is partially unfolded, partially misfolded, or partially denatured.
 7. The mixture of claim 1, wherein the polypeptide comprises an amino acid sequence with at least 90% identity to SEQ ID NO: 1.
 8. The mixture of any of the preceding claims, wherein the polypeptide comprises a Cel3A enzyme from T. reesei, or a functional variant or fragment thereof.
 9. The mixture of claim 8, wherein the Cel3A enzyme comprises the amino acid sequence SEQ ID NO: 1, or an amino acid sequence with at least 90% identity thereof.
 10. The mixture of any of the preceding claims, wherein the polypeptide is encoded by a nucleic acid sequence comprising at least 90% identity to SEQ ID NO: 2 or SEQ ID NO: 3.
 11. The mixture of any of the preceding claims, wherein the polypeptide is aglycosylated.
 12. The mixture of any of claims 1-3 or 7, wherein the solubilizing agent comprises urea, and optionally, is present at a concentration between 0.2M-6M.
 13. The mixture of any of the preceding claims, further comprising at least one additional polypeptide having a biomass-degrading activity or a microorganism that produces one or more enzymes having a biomass-degrading activity.
 14. The mixture of claim 13, wherein the additional polypeptide is selected from a ligninase, an endoglucanase, a cellobiohydrolase, a cellobiase, and a xylanase, or any combination thereof.
 15. The mixture of claim 13 or 14, wherein the additional polypeptide is selected from: a. a polypeptide comprising an amino acid sequence with at least 90% identity to SEQ ID NO: 1; b. a Cel3A enzyme from T. reesei, or a functional variant or fragment thereof; or c. a polypeptide encoded by a nucleic acid sequence comprising (e.g., consisting of) SEQ ID NO: 2 or SEQ ID NO: 3.
 16. The mixture of any of claims 13-15, wherein the additional polypeptide is aglycosylated.
 17. The mixture of any of claims 13-15, wherein the additional polypeptide is glycosylated.
 18. A mixture comprising a plurality of polypeptides having an amino acid sequence with at least 90% identity to SEQ ID NO: 1 and a solubilizing agent, wherein the plurality of polypeptides have at least 20%-40% of the activity of the native polypeptide comprising SEQ ID NO: 1.
 19. The mixture of claim 18, further comprising one or more proteins associated with an inclusion body.
 20. The mixture of claim 18, wherein the mixture does not comprise one or more proteins associated with an inclusion body.
 21. The mixture of any of claims 18-20, further comprising cellular debris, one or more ribosomal component, one or more host protein, and/or host nucleic acid comprising DNA and/or RNA.
 22. The mixture of any of claims 18-21, wherein the polypeptide is partially unfolded, partially misfolded, or partially denatured.
 23. The mixture of any of claims 18-22, wherein the polypeptide is encoded by a nucleic acid sequence comprising at least 90% identity to SEQ ID NO: 2 or SEQ ID NO: 3.
 24. The mixture of any of claims 18-23, wherein the polypeptide is aglycosylated.
 25. The mixture of any of claims 18-20, wherein the solubilizing agent comprises urea, and optionally, is present at a concentration between 0.2M-6M.
 26. The mixture of any of claims 18-25, further comprising at least one additional polypeptide having a biomass-degrading activity or a microorganism that produces one or more enzymes having a biomass-degrading activity.
 27. The mixture of claim 26, wherein the additional polypeptide is selected from a ligninase, an endoglucanase, a cellobiohydrolase, a cellobiase, and a xylanase, or any combination thereof.
 28. The mixture of claim 26 or 27, wherein the additional polypeptide is selected from: a. a polypeptide comprising an amino acid sequence with at least 90% identity to SEQ ID NO: 1; b. a Cel3A enzyme from T. reesei, or a functional variant or fragment thereof; or c. a polypeptide encoded by a nucleic acid sequence comprising (e.g., consisting of) SEQ ID NO: 2 or SEQ ID NO: 3.
 29. The mixture of any of claims 26-28, wherein the additional polypeptide is aglycosylated.
 30. The mixture of any of claims 26-28, wherein the additional polypeptide is glycosylated.
 31. A method for producing a mixture of any of claims 1-30 comprising contacting a cell expressing the polypeptide having biomass-degrading activity, or lysate thereof, with a solubilizing agent at a concentration suitable for solubilizing the polypeptide.
 32. The method of claim 31, further comprising lysing the cell to obtain a lysate, separating a soluble fraction from an insoluble fraction of the lysate, and resuspending the insoluble fraction in the solubilizing agent.
 33. The method of claim 31 or 32, wherein the solubilizing agent is urea, and optionally, wherein the concentration of the solubilizing agent is between 0.2M-6M.
 34. The method of any of claims 31-33, wherein the biomass-degrading activity is a cellobiase activity, a ligninase activity, an endoglucanase activity, a cellobiohydrolase, or a xylanase activity.
 35. The method of any of claims 31-34, wherein the polypeptide comprises an amino acid sequence with at least 90% identity to SEQ ID NO: 1.
 36. The method of any of claims 31-35, wherein the polypeptide comprises a Cel3A from T. reesei, or a functional variant or fragment thereof.
 37. The method of any of claims 31-36, wherein the polypeptide is aglycosylated.
 38. A method for producing a polypeptide having biomass-degrading activity comprising expressing the polypeptide in a cell and contacting the cell or a lysate thereof with a solubilizing agent at a concentration suitable for solubilizing the polypeptide.
 39. A method for producing a polypeptide having biomass-degrading activity comprising providing a cell that has been genetically modified to produce at least one polypeptide having biomass-degrading activity, wherein at least a portion of said polypeptide having biomass-degrading activity is found in inclusion bodies, and contacting the cell, or a lysate thereof containing the inclusion bodies, with a solubilizing agent at a concentration suitable for solubilizing the polypeptide.
 40. The method of claim 38 or 39, wherein the solubilizing agent comprises urea.
 41. The method of any of claims 38-40, wherein the concentration of the solubilizing agent is between 0.2M-6M.
 42. The method of any of claims 38-41, further comprising lysing the cell to obtain a lysate, separating a soluble fraction from an insoluble fraction of the lysate, and resuspending the insoluble fraction in the solubilizing agent.
 43. The method of any of claims 38-42, wherein the biomass-degrading activity is a cellobiase activity, a ligninase activity, an endoglucanase activity, a cellobiohydrolase activity, or a xylanase activity.
 44. The method of claim 38 or 39, wherein the polypeptide comprises an amino acid sequence with at least 90% identity to SEQ ID NO: 1.
 45. The method of any of claims 38-44, wherein the polypeptide comprises a Cel3A from T. reesei, or a functional variant or fragment thereof.
 46. The method of any of claims 38-45, wherein the cell is a prokaryotic or bacterial cell, e.g., E. coli cell, origami E. coli cell.
 47. The method of any of claims 38-46, wherein the polypeptide is aglycosylated.
 48. A method of producing a product from a biomass comprising contacting a biomass with the mixture of any of claims 1-30, and, optionally, a microorganism that produces one or more biomass-degrading enzyme and/or an enzyme mixture comprising biomass-degrading enzymes, under conditions suitable for the production of the product.
 49. The method of claim 48, further comprising treating the biomass with an electron beam prior to contacting the biomass with the mixture.
 50. The method of claim 48 or 49, wherein the product is a sugar product.
 51. The method of claim 50, wherein the sugar product is glucose and/or xylose.
 52. The method of any of claims 48-51, further comprising isolating the product.
 53. The method of claim 52, wherein the isolating of the product comprises precipitation, crystallization, chromatography, centrifugation, and/or extraction.
 54. The method of any of claims 48-53, wherein the enzyme mixture comprises at least two of the enzymes selected from B2AF03, CIP1, CIP2, Cel1a, Cel3a, Cel5a, Cel6a, Cel7a, Cel7b, Cel12a, Cel45a, Cel74a, paMan5a, paMan26a, and Swollenin.
 55. The method of any of claims 48-54, wherein the biomass comprises one or more of an agricultural product or waste, a paper product or waste, a forestry product, or a general waste, or any combination thereof, wherein: a) an agricultural product or waste comprises sugar cane jute, hemp, flax, bamboo, sisal, alfalfa, hay, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, canola straw, wheat straw, barley straw, oat straw, rice straw, corn cobs, corn stover, corn fiber, coconut hair, beet pulp, bagasse, soybean stover, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, or beeswing, or a combination thereof; b) a paper product or waste comprises paper, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, cardstock, cardboard, paperboard, or paper pulp, or a combination thereof; c) a forestry product comprises aspen wood, particle board, wood chips, or sawdust, or a combination thereof; and d) a general waste comprises manure, sewage, or offal, or a combination thereof.
 56. The method of any of claims 48-55, further comprising a step of treating the biomass prior to introducing the microorganism or the enzyme mixture to reduce the recalcitrance of the biomass, wherein the treating comprises bombardment with electrons, sonication, oxidation, pyrolysis, steam explosion, chemical treatment, mechanical treatment, or freeze grinding.
 57. The method of any of claims 48-56, wherein the microorganism that produces a biomass-degrading enzyme is from species in the genera selected from Bacillus, Coprinus, Myceliophthora, Cephalosporium, Scytalidium, Penicillium, Aspergillus, Pseudomonas, Humicola, Fusarium, Thielavia, Acremonium, Chrysosporium or Trichoderma.
 58. The method of any of claims 48-57, wherein the microorganism that produces a biomass-degrading enzyme is selected from Aspergillus, Humicola insolens (Scytalidium thermophilum), Coprinus cinereus, Fusarium oxysporum, Myceliophthora thermophila, Meripilus giganteus, Thielavia terrestris, Acremonium persicinum, Acremonium acremonium, Acremonium brachypenium, Acremonium dichromosporum, Acremonium obclavatum, Acremonium pinkertoniae, Acremonium roseogriseum, Acremonium incoloratum, Acremonium furatum, Chrysosporium lucknowense, Trichoderma viride, Trichoderma reesei, or Trichoderma koningii.
 59. The method of any of claims 48-58, wherein the microorganism has been induced to produce biomass-degrading enzymes by combining the microorganism with an induction biomass sample under conditions suitable for increasing production of biomass-degrading enzymes compared to an uninduced microorganism.
 60. The method of any of claims 48-59, wherein the induction biomass sample comprises one or more of an agricultural product or waste, a paper product or waste, a forestry product, or a general waste, or any combination thereof, wherein: a) an agricultural product or waste comprises sugar cane jute, hemp, flax, bamboo, sisal, alfalfa, hay, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, canola straw, wheat straw, barley straw, oat straw, rice straw, corn cobs, corn stover, corn fiber, coconut hair, beet pulp, bagasse, soybean stover, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, or beeswing, or a combination thereof; b) a paper product or waste comprises paper, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, cardstock, cardboard, paperboard, or paper pulp, or a combination thereof; c) a forestry product comprises aspen wood, particle board, wood chips, or sawdust, or a combination thereof; and d) a general waste comprises manure, sewage, or offal, or a combination thereof.